Abstract

This thesis addresses several resource allocation problems that arise in the context of distributed
networks. First, we present a scheme for accessing shared copies of objects in
a network that has asymptotically optimal expected cost per access for a class of cost
functions  that  captures  the  hierarchical  structure  of  most  wide-area  networks.  Second,  we 
present  an  off-line  polynomial-time  algorithm  that  finds  an  asymptotically  optimal  schedule 
for  the  movement  of  packets  whose  paths  through  a  network  have  already  been  determined. 
This  is  an  improvement  on  a  previous  result  by  Leighton,  Maggs,  and  Rao,  who  proved  the 
existence of such schedules; their proof, however, was not constructive. Finally, we present
a polynomial-time O(log n)-approximation algorithm for finding an embedding of a network
with n processors into an n-node linear array so as to minimize the weighted sum
of the edge dilations, i.e., for the minimum linear arrangement problem. This problem
is NP-hard, and the previous best approximation bound known was O(log n log log n). In
the case of planar networks, we bring the approximation factor down to O(log log n). We
also  extend  our  approximation  techniques  to  the  minimum  storage-time  product  and  the 
minimum  containing  interval  graph  problem. 
Chapter 1
Introduction 


The  advent  of  high-speed  distributed  networks  has  made  it  feasible  for  a  large  number 
of geographically dispersed computers to cooperate and share information (e.g., messages,
files, control data). Indeed, the last few years have seen the emergence of large distributed
databases, such as the World Wide Web, and more generally, of a variety of distributed
network applications that rely on communication for performing their basic tasks. The distributed
nature of the databases and the rapidly growing demands of the users have in turn
overloaded  the  underlying  network  resources  (e.g.,  links,  memory  space  at  the  processors, 
buffer  space  at  the  links  and  processors). 

In an attempt to minimize communication delays and to satisfy as many users as possible,
strategies for making efficient use of network resources have been devised. This thesis
follows that approach: we use efficient resource allocation strategies to obtain
efficient solutions to three problems that arise in the context of distributed networks.

The  first  problem  we  consider  in  this  thesis  is  the  one  of  efficiently  supporting  requests 
for shared objects (e.g., files, pages of memory) that have been distributed among the processors
(nodes) of a wide-area network. Multiple copies of each object may exist in the
network. In particular, we would like to devise a protocol that satisfies each request for an
object with a “nearby” copy of the object, since this ensures fast response times and minimizes
the cost incurred in accessing the object. Chapter 2 presents a protocol that achieves
asymptotically optimal expected cost for satisfying a request for an object while making
efficient use of the memory space at each node, for a class of cost functions that captures
the hierarchical structure of most wide-area networks.

Next,  we  study  the  movement  of  packets  in  a  network:  More  specifically,  we  consider 
the  problem  of  scheduling  the  movement  of  packets  whose  edge-simple^  paths  through  a 
network  have  already  been  determined.  This  problem  arises  in  a  scenario  where  nodes  of 
the  network  exchange  information  via  point-to-point  communication  paths.  In  Chapter  3 
of  this  thesis,  we  present  a  polynomial-time  algorithm  that  finds  an  asymptotically  optimal 
schedule for routing the packets along the given paths. This is an improvement on a previous
result by Leighton, Maggs, and Rao [26], who proved the existence of such schedules;
their  proof,  however,  was  not  constructive. 

The  problem  of  scheduling  the  movement  of  packets  in  the  network  relates  to  network 
emulations that are performed via network embeddings, as we will see. Network emulations
and embeddings will also be addressed in Chapter 4, where we consider embeddings
of  networks  into  the  linear  array.  Thus  we  briefly  discuss  emulations  and  embeddings  in 
the  paragraphs  that  follow.  For  a  more  complete  discussion  of  emulations  and  embeddings, 
see  [24];  see  also  [32]. 

We  can  model  the  topology  of  a  network  as  a  graph  G(V,  E),  where  each  node  in  V 
uniquely represents a processor of the network, and where each edge (x, y) in E uniquely
represents  a  communication  link  between  the  processors  corresponding  to  nodes  x  and  y 
in  the  network.  Throughout  this  thesis,  we  will  implicitly  use  this  model,  interchangeably 
referring  to  a  network  as  a  graph,  and  to  processors  and  links  of  the  network  as  nodes  and 
edges  respectively. 

A  guest  network  G  can  be  emulated  by  a  host  network  H  by  embedding  G  into  H.  An 
embedding maps nodes of G to nodes of H, and edges of G to paths in H: an edge (x, y)
of  G  is  mapped  to  some  path  in  H  between  the  nodes  of  H  that  x  and  y  were  mapped  to. 
There  are  three  important  measures  of  an  embedding:  the  load,  congestion,  and  dilation. 
The  load  of  an  embedding  is  the  maximum  number  of  nodes  of  G  that  are  mapped  to 
any  one  node  of  H.  The  congestion  of  an  embedding  is  the  maximum  number  of  paths 
corresponding to edges of G that use any one edge of H. The dilation of an embedding
is the length of the longest path of H in the embedding. Let l, c, and d denote the load,
congestion, and dilation of the embedding, respectively.

^ An edge-simple path uses no edge (i.e., link of the network) more than once.

Once  G  has  been  embedded  in  H,  H  can  emulate  G  in  a  step-by-step  fashion  as  follows. 
Each node of H first emulates the local computations performed by the l (or fewer) nodes
mapped to it. This takes O(l) time. Then for each packet sent along an edge of G, H sends
a packet along the corresponding path in the embedding. Using the algorithm presented in
Chapter 3, H can emulate each step of G in O(l + c + d) steps.
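
As an illustration of the three measures, the following Python sketch computes the load, congestion, and dilation of a small embedding. The function name embedding_measures, the dictionary-based representation, and the example (a 4-cycle embedded in a 4-node linear array) are assumptions made for this illustration only; they are not taken from the algorithms of the later chapters.

```python
from collections import Counter


def embedding_measures(node_map, edge_paths):
    """Return (load, congestion, dilation) of an embedding.

    node_map:   dict mapping each guest node to a host node
    edge_paths: dict mapping each guest edge (x, y) to the list of host
                nodes on the path from node_map[x] to node_map[y]
    """
    # Load: maximum number of guest nodes mapped to any one host node.
    load = max(Counter(node_map.values()).values())

    # Congestion: maximum number of paths that use any one (undirected) host edge.
    edge_use = Counter()
    for path in edge_paths.values():
        for a, b in zip(path, path[1:]):
            edge_use[frozenset((a, b))] += 1
    congestion = max(edge_use.values(), default=0)

    # Dilation: length, in edges, of the longest path in the embedding.
    dilation = max((len(p) - 1 for p in edge_paths.values()), default=0)
    return load, congestion, dilation


# Guest: the 4-cycle a-b-c-d-a. Host: the linear array 0-1-2-3.
node_map = {"a": 0, "b": 1, "c": 2, "d": 3}
edge_paths = {
    ("a", "b"): [0, 1],
    ("b", "c"): [1, 2],
    ("c", "d"): [2, 3],
    ("d", "a"): [3, 2, 1, 0],  # the wrap-around edge must cross the whole array
}
print(embedding_measures(node_map, edge_paths))  # (1, 2, 3)
```

In this example every host edge carries two paths and the wrap-around edge of the cycle is stretched to length 3, so l = 1, c = 2, and d = 3.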

We  address  a  problem  that  relates  to  embeddings  of  networks  into  the  linear  array  in 
Chapter  4.  Suppose  a  network  G  with  n  nodes  is  embedded  one-to-one  (with  respect  to  the 
mapping  of  its  nodes)  into  a  network  H.  The  dilation  of  an  edge  of  G  in  the  embedding 
is  the  length  of  the  path  of  H  that  this  edge  is  mapped  to.  We  would  like  to  be  able  to 
minimize  the  average  edge  dilation  of  the  embedding,  since  high  average  dilation  may 
incur  high  average  cost  of  communication.  Unfortunately,  the  problem  of  determining  if 
there  exists  an  embedding  with  average  edge  dilation  d',  for  any  d'  >  0,  is  NP-hard  even 
for  the  case  when  the  host  network  is  a  linear  array.  A  generalization  of  this  problem  is 
to  assign  nonnegative  weights  to  each  edge  of  G,  which  may  represent  the  amount  (or  the 
cost)  of  communication  through  that  edge;  in  this  case,  we  would  like  to  minimize  the 
average  weighted  edge  dilation  of  the  embedding. 

In Chapter 4, we present a polynomial-time O(log n)-approximation algorithm for finding
a one-to-one embedding of a graph with n nodes into the n-node linear array so as
to minimize the weighted sum of the edge dilations. An embedding that has minimum
weighted sum of edge dilations is called a minimum linear arrangement. If the network is
a planar graph, we obtain an improved approximation factor of O(log log n).
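
To make the objective concrete, the sketch below evaluates the weighted sum of edge dilations for a given ordering of the nodes, and finds a minimum linear arrangement of a tiny graph by exhaustive search. This is not the approximation algorithm of Chapter 4: the helper names and the example star graph are hypothetical, and the search takes exponential time.

```python
from itertools import permutations


def arrangement_cost(order, weighted_edges):
    """Weighted sum of edge dilations when the nodes are laid out in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(w * abs(pos[u] - pos[v]) for u, v, w in weighted_edges)


def brute_force_mla(nodes, weighted_edges):
    """Exhaustive search over all orderings; only feasible for tiny graphs."""
    return min(permutations(nodes),
               key=lambda order: arrangement_cost(order, weighted_edges))


# A star with center "c" and three unit-weight edges.
edges = [("c", "x", 1), ("c", "y", 1), ("c", "z", 1)]
best = brute_force_mla(["c", "x", "y", "z"], edges)
print(best, arrangement_cost(best, edges))
```

Placing the center at an end of the array costs 1 + 2 + 3 = 6, whereas placing it in an interior position costs 1 + 1 + 2 = 4, which is the minimum for this graph.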

We  conclude  Chapter  4  by  extending  the  ideas  used  for  approximating  the  minimum 
linear arrangement problem to obtain O(log n)-approximation algorithms for two other
problems that involve finding a linear ordering of the nodes of a graph: the problems of
finding a minimum storage-time product, and of finding a minimum cost containing interval
graph of a given graph. For the latter problem, in case the input graph is planar, we also
obtain an improved approximation bound of O(log log n).

Each of the following chapters is self-contained. Concluding remarks and suggestions
of future work regarding any of the problems considered will be presented at the end of the
relevant chapter.

1.1  Notation 

In  this  section  we  introduce  some  basic  notation  that  will  be  used  in  this  thesis.  Throughout 
this thesis, for any positive integer x, we use [x] to denote the set {0, ..., x − 1}; and for
any integers a and b, we let [a, b] denote the set {k ∈ Z : a ≤ k ≤ b}. Also, all logarithms
are  to  base  2,  unless  otherwise  specified. 

In the context of randomized algorithms (Chapters 2 and 3), we use “with high probability”
to mean “with probability at least 1 − n^{-c}, where n is the number of nodes in the
network  and  c  is  a  constant  that  can  be  set  arbitrarily  large  by  appropriately  adjusting  other 
constants  defined  within  the  relevant  context.” 


Chapter  2 


Accessing  Nearby  Copies  of  Replicated 
Objects  in  a  Distributed  Environment 


2.1  Introduction 


As one might expect, the task of designing efficient algorithms for supporting access to
shared objects (e.g., files, pages of memory) over wide-area networks is challenging, from
both a practical and a theoretical perspective. With respect to any interesting measure
of performance (e.g., latency, throughput), the optimal bound achievable by a given
network is a complex function of many parameters, including edge delays, edge capacities,
buffer space, communication overhead, and patterns of user communication. Ideally,
we  would  like  to  take  all  of  these  factors  into  account  when  optimizing  performance  with 
respect  to  a  given  measure.  However,  such  a  task  may  not  be  feasible  in  general  because 
the  many  network  parameters  interact  in  a  complex  manner.  For  this  reason,  we  adopt  a 
simplified  cost  model  for  a  network,  in  which  the  combined  effect  of  the  detailed  network 
parameter  values  is  assumed  to  be  captured  by  a  single  function  that  specifies  the  cost  of 
communicating  a  fixed-length  message  between  any  given  pair  of  nodes.  We  anticipate 
that  analyzing  algorithms  under  this  model  will  significantly  aid  in  the  design  of  practical 
algorithms for modern distributed networks.

This  is  joint  work  with  Greg  Plaxton,  University  of  Texas  at  Austin,  and  Rajmohan  Rajaraman,  DIMACS; 
a  preliminary  version  of  this  work  appears  in  [41]. 
Accessing shared objects. Consider a set 𝒜 of m objects being shared by a network
G, where several copies of each object may exist. In this paper, we consider the basic
problem of reading objects in 𝒜. Motivated by the need for efficient network utilization, we
seek  algorithms  that  minimize  the  cost  of  the  read  operation.  We  do  not  address  the  write 
operation,  which  involves  the  additional  consideration  of  maintaining  consistency  among 
the  various  copies  of  each  object.  The  problem  of  consistency,  although  an  important  one, 
is  separate  from  our  main  concern,  namely,  that  of  studying  locality.  Our  results  for  the 
read  operation  apply  for  the  write  operation  in  scenarios  where  consistency  either  is  not 
required  or  is  enforced  by  an  independent  mechanism. 

We  differentiate  between  shared  and  unshared  copies  of  objects.  A  copy  is  shared  if 
any  node  can  read  this  copy;  it  is  unshared  if  only  the  node  that  holds  the  copy  may  read  it. 
We  say  that  a  node  u  inserts  (resp.,  deletes)  a  copy  of  object  A  (that  u  holds)  if  u  declares 
the  copy  shared  (resp.,  unshared). 

We  refer  to  the  set  of  algorithms  for  read,  insert,  and  delete  operations  as  an  access 
scheme.  Any  access  scheme  that  efficiently  supports  these  operations  incurs  an  overhead 
in memory. It is desirable that this overhead be small, not only because of space considerations,
but also because low overhead usually implies fast adaptability to changes in the
network  topology  or  in  the  set  of  object  copies. 

The  main  source  of  difficulty  in  designing  an  access  scheme  that  is  efficient  with  respect 
to  both  time  and  space  is  the  competing  considerations  of  these  measures.  For  example, 
consider  an  access  scheme  in  which  each  node  stores  the  location  of  the  closest  copy  of  each 
object  in  the  network.  This  allows  very  fast  read  operations  since  a  node  knows  the  location 
of  the  closest  copy  of  any  desired  object.  However,  such  an  access  scheme  is  impractical 
because  it  incurs  a  prohibitively  large  memory  overhead  (proportional  to  the  number  of 
objects  in  the  network),  and  every  node  of  the  network  may  have  to  be  informed  whenever 
a  copy  of  an  object  is  inserted  or  deleted.  At  the  other  extreme,  one  might  consider  an 
access  scheme  using  no  additional  memory.  In  this  case  insert  and  delete  operations  are 
fast,  but  read  operations  are  costly  since  it  may  be  necessary  to  search  the  entire  network 
in  order  to  locate  a  copy  of  some  desired  object. 

Our access scheme. We design a simple randomized access scheme that exploits locality
and distributes control information to achieve low overhead in memory. The central
part of our access scheme is a mechanism to maintain and locate the addresses of copies of
objects. For a single object, say A, we can provide such a mechanism by the following approach.
We embed an n-node “virtual” height-balanced tree T one-to-one into the network.
Each  node  u  of  the  network  maintains  information  associated  with  the  copies  of  A  residing 
in  the  set  of  nodes  that  form  the  subtree  of  T  rooted  at  u.  Given  the  embedding  of  T,  the 
read  operation  may  be  easily  defined  as  follows.  When  a  node  u  attempts  to  read  A,  u  first 
checks  its  local  memory  for  a  copy  of  A  or  information  about  copies  of  A  in  the  subtree  of 
T  rooted  at  u.  If  this  local  check  is  unsuccessful,  u  forwards  the  request  for  object  A  to  its 
parent. 
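
The single-object mechanism described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the scheme of Section 2.3: the class TreeNode, the fields has_copy and known_holder, and in particular the rule that an insert records the copy at every ancestor are hypothetical choices made here, since the text above specifies only the read operation.

```python
class TreeNode:
    """A node of the virtual tree T for a single object A (illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.has_copy = False      # does this node hold a shared copy of A?
        self.known_holder = None   # some node in this subtree holding a copy


def insert_copy(node):
    """Declare the copy at `node` shared and record it along the path to the root.

    Assumption: every ancestor remembers one holder in its subtree.
    """
    node.has_copy = True
    u = node
    while u is not None:
        if u.known_holder is None:
            u.known_holder = node
        u = u.parent


def read(node):
    """Return (holder, hops): check locally, otherwise forward to the parent."""
    u, hops = node, 0
    while u is not None:
        if u.has_copy:
            return u, hops
        if u.known_holder is not None:
            return u.known_holder, hops
        u, hops = u.parent, hops + 1
    return None, hops  # no shared copy of A exists in the tree


root = TreeNode("root")
left, right = TreeNode("left", root), TreeNode("right", root)
leaf = TreeNode("leaf", left)
insert_copy(leaf)
holder, hops = read(right)
print(holder.name, hops)  # leaf 1
```

Here the request issued at the node named right finds nothing locally, is forwarded once to the root, and learns there that a copy resides in the subtree rooted at left.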

Naive  extensions  of  the  above  approach  to  account  for  all  objects  require  significant 
overhead in memory for control information at individual nodes. We overcome this problem
by designing a novel method to embed the different trees associated with different
objects. Our embedding enables us to define simple algorithms for the read, insert, and
delete operations, and to prove their efficiency for a class of cost functions that is appropriate
for modeling wide-area networks.

One important property of our access scheme is that it does not require location dependent
naming of the copies of the objects, as we will see. Thus it avoids renaming a
copy of an object every time this copy migrates (i.e., moves to another location in the network).
Location dependent naming also poses a problem when keeping track of replicated
objects,  since  copies  of  the  same  object  located  at  different  addresses  in  the  network  will 
have  different  names.  A  distributed  shared  object  may  be  replicated  in  order  to  improve 
fault-tolerance  or  performance,  for  example.  Other  important  properties  of  our  scheme  for 
the  restricted  class  of  cost  functions  considered  are  that  (i)  for  its  distribution  of  control 
information  and  of  shared  data,  our  scheme  is  expected  to  avoid  “hot-spots”  in  the  network 
(i.e., heavily accessed nodes); and (ii) for its distribution of data, combined with its support
for object replication, and fast adaptability to changes in the network, our scheme is
expected to scale well. Scalability is one of the most important problems to be solved in
today’s large-scale networks: for example, the World Wide Web, in spite of using scalable
components (e.g., clients, servers, TCP/IP connections, DNS), has serious problems
of  scalability  as  a  whole. 
The cost model. As indicated above, we assume that a given function determines the
cost  of  communication  between  each  pair  of  nodes  in  the  network.  Our  analysis  is  geared 
towards  a  restrictive  class  of  cost  functions  which  we  believe  to  be  of  practical  interest. 
The  precise  set  of  assumptions  that  we  make  with  respect  to  the  cost  function  is  stated  in 
Section  2.2.  Our  primary  assumption  is  that  for  all  nodes  x  and  costs  r,  the  ratio  of  the 
number  of  nodes  within  cost  2r  of  node  x  to  the  number  of  nodes  within  cost  r  of  node  x 
is  bounded  from  above  and  below  by  constants  greater  than  1  (unless  the  entire  network  is 
within  cost  2r  of  node  x,  in  which  case  the  ratio  may  be  as  low  as  1). 

There are several important observations we can make concerning this primary assumption
on the cost function. First, a number of commonly studied fixed-connection network
families lead naturally to cost functions satisfying this assumption. For example, fixed-dimension
meshes satisfy this assumption if the cost of communication between two nodes
is  defined  as  the  minimum  number  of  hops  between  them;  constant  degree  trees  satisfy 
this  assumption  if  the  cost  of  communication  between  two  nodes  is  given  by  the  distance 
between  these  nodes  in  the  physical  layout  (e.g.,  a  wide-area  layout,  or  a  VLSI  layout)  of 
the  tree. 

Following  the  latter  example,  fat-tree  topologies  [30]  satisfy  our  assumption  if  the  cost 
of  communication  between  two  nodes  is  determined  by  the  total  cost  of  a  shortest  path 
between them, where the cost assigned to individual edges grows at an appropriate geometric
rate as we move higher in the tree. Fat-trees are of particular interest here, because
of  all  the  most  commonly  studied  fixed-connection  network  families,  the  fat-tree  captures 
the  hierarchical  structure  of  most  wide-area  networks,  and  may  provide  the  most  plausible 
approximation  to  the  structure  of  current  wide-area  networks. 

Even  so,  it  is  probably  inappropriate  to  attempt  to  model  the  Internet,  say,  with  any  kind 
of  uniform  topology,  including  the  fat-tree.  Note  that  our  assumption  on  the  cost  function  is 
purely  “local”  in  nature,  and  allows  for  the  possibility  of  a  network  with  a  highly  irregular 
global  structure.  This  may  be  the  most  important  characteristic  of  our  cost  model. 

Performance bounds. We show that our access scheme achieves optimality or near-optimality
in terms of several important complexity measures for the restricted class of cost
functions  discussed  above.  In  particular,  our  scheme  achieves  the  following  bounds: 
• The expected cost for any read request is asymptotically optimal.

• If the number of objects that can be stored at each node is q, then the additional memory
required is O(q log² n) words with high probability, where a word is an O(log n)-bit
string. Thus, if the objects are sufficiently large, i.e., Ω(log² n) words, the memory for
objects dominates the additional memory.

• The expected cost of an insert (resp., delete) operation at node u is O(C) (resp.,
O(C log n)), where C is the maximum cost of communicating a single-word message
between any two nodes.

• The number of nodes that need to be updated upon the addition or removal of a node is
O(log n) expected and O(log² n) with high probability.

An  obvious  shortcoming  of  our  analysis  is  that  it  only  applies  to  the  restricted  class  of 
cost  functions  discussed  above.  While  we  do  not  expect  that  all  existing  networks  fall 
precisely  within  this  restricted  class,  we  stress  that  (i)  our  access  scheme  is  well-defined, 
and  functions  correctly,  for  arbitrary  networks,  and  (ii)  we  expect  that  our  access  scheme 
would  have  good  practical  performance  on  any  existing  network.  (Although  we  have  not 
attempted  to  formalize  any  results  along  these  lines,  it  seems  likely  that  our  performance 
bounds  would  only  degrade  significantly  in  the  presence  of  a  large  number  of  nontrivial 
violations  of  our  cost  function  assumptions.) 

Related  work.  The  basic  problem  of  sharing  memory  in  distributed  systems  has  been 
studied extensively in different forms. Most of the earlier work in this area, as in emulations
of PRAM on completely-connected distributed-memory machines (e.g., [21, 54]) or
bounded-degree networks (e.g., [45]), and algorithms for providing concurrent access to a
set of shared objects [40], assumes that each of the nodes of the network has knowledge
of  a  hash  function  that  indicates  the  location  of  any  copy  of  any  object. 

The  basic  problem  of  locating  an  object  arises  in  every  distributed  system  [37],  and 
was formalized by Mullender and Vitanyi [38] as an instance of the distributed matchmaking
problem. Awerbuch and Peleg [5], and subsequently Bartal, Fiat, and Rabani [7] and
Awerbuch,  Bartal,  and  Fiat  [3],  give  near-optimal  solutions  in  terms  of  cost  to  a  related 
problem  by  defining  sparse-neighborhood  covers  of  graphs.  Their  studies  do  not  address 
the overhead due to control information and hence, natural extensions of their results to our
problem  may  require  an  additional  memory  of  m  words  at  some  node.  However,  we  note 
that  their  schemes  are  designed  for  arbitrary  cost  functions,  whereas  we  have  focused  on 
optimizing  performance  for  a  restricted  class  of  cost  functions. 

In  [6],  Awerbuch  and  Peleg  examine  the  problem  of  maintaining  a  distributed  directory 
server,  that  enables  keeping  track  of  mobile  users  in  a  distributed  network.  This  problem 
can  be  viewed  as  an  object  location  problem,  where  objects  migrate  in  the  network. 

In  recent  work,  access  schemes  for  certain  Internet  applications  have  been  described 
in  [18,  20,  55].  Some  of  the  ideas  in  our  scheme  are  similar  to  those  in  [55];  however,  the 
two  schemes  differ  considerably  in  the  details.  Moreover,  the  schemes  of  [18]  and  [55] 
have  not  been  analyzed.  As  in  our  study,  the  results  of  [20]  concerning  locality  assume 
a  restricted  cost  model.  However,  their  cost  model,  which  is  based  on  the  ultrametric,  is 
different  from  ours.  Also,  their  algorithms  are  primarily  designed  for  problems  associated 
with  “hot  spots”  (i.e.,  popular  objects). 

In  [31],  Maggs  et  al.  investigate  both  the  problem  of  determining  the  placement  of 
copies  of  the  objects  in  the  network,  and  the  problem  of  devising  an  efficient  access  scheme, 
with the main goal of keeping the edge congestion low. Their work considers cost models
that arise in some restricted network topologies, such as trees, meshes, and clustered
networks. 

A closely related problem is that of designing a dynamic routing scheme for networks
[4, 11]. Such a scheme involves maintaining routing tables at different nodes of the
network  in  much  the  same  way  as  our  additional  memory.  However,  in  routing  schemes  the 
size  of  additional  memory  is  a  function  of  network  size,  i.e.,  n,  while  in  our  problem  the 
overhead  is  primarily  a  function  of  m.  Straightforward  generalizations  of  routing  schemes 
result  in  access  schemes  that  require  an  additional  memory  of  m  words  at  each  node. 

The  remainder  of  this  paper  is  organized  as  follows.  Section  2.2  defines  the  model 
of  computation.  Section  2.3  presents  our  access  scheme.  Section  2.4  contains  a  formal 
statement  of  the  main  results.  Section  2.5  analyzes  the  algorithm  and  establishes  the  main 
results.  Section  2.6  discusses  directions  for  future  research. 
2.2 Model of computation

We consider a set V of n nodes, each with its own local memory, sharing a set 𝒜 of
m = poly(n) objects. We define our model of computation by characterizing the following
aspects of the problem: (i) objects, (ii) communication, (iii) local memory, (iv) local
computation,  and  (v)  complexity  measures. 

Objects. Each object A has a unique (log m)-bit identification. For i in [log m], we
denote the ith bit of the identification of A by A^i. Each object A consists of ℓ(A) words,
where a word is an O(log n)-bit string.

Communication.  Nodes  communicate  with  one  another  by  means  of  messages;  each 
message  consists  of  at  least  one  word.  We  assume  that  the  underlying  network  supports 
reliable  communication. 

We define the cost of communication by a function c : V² → R. For any two nodes u and
v in V, c(u, v) is the cost of transmitting a single-word message from u to v. We assume
that c is symmetric and satisfies the triangle inequality. We also assume for simplicity
that for u, v, and w in V, c(u, v) = c(u, w) if and only if v = w. (We make the latter
assumption  for  the  sake  of  convenience  only,  and  with  essentially  no  loss  in  generality, 
since  an  arbitrarily  small  perturbation  in  the  cost  function  can  be  used  to  break  ties.) 

The cost of transmitting a message of length ℓ from node u to node v is given by
f(ℓ)c(u, v), where f : N → R⁺ is any nondecreasing function such that f(1) = 1.

Given any u in V and any real r, let M(u, r) denote the set {v ∈ V : c(u, v) ≤ r}. We
refer to M(u, r) as the ball of radius r around u. We assume that there exist real constants
δ > 8 and Δ such that for any node u in V and any real r ≥ 1, we have

min{δ|M(u, r)|, n} ≤ |M(u, 2r)| ≤ Δ|M(u, r)|,    (2.1)

as illustrated in Figure 2.1.
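
To give a feel for the ball-growth condition, the following Python sketch measures the ratio |M(u, 2r)| / |M(u, r)| empirically on a small example. The choice of a two-dimensional torus with hop-count cost (used instead of a mesh to avoid boundary effects), the side length, the radii, and the function names are all assumptions of this illustration. The sketch only reports the observed ratios; whether a given cost function meets the specific constants required by (2.1) is a separate question that it does not settle.

```python
def ball(nodes, cost, u, r):
    """M(u, r): the nodes within cost r of u."""
    return [v for v in nodes if cost(u, v) <= r]


def growth_ratios(nodes, cost, radii):
    """Smallest and largest |M(u, 2r)| / |M(u, r)| over all u and all given r,
    skipping the cases where M(u, 2r) already covers the whole network."""
    n = len(nodes)
    ratios = []
    for u in nodes:
        for r in radii:
            big = len(ball(nodes, cost, u, 2 * r))
            if big < n:
                ratios.append(big / len(ball(nodes, cost, u, r)))
    return min(ratios), max(ratios)


SIDE = 16  # hypothetical torus side length
nodes = [(i, j) for i in range(SIDE) for j in range(SIDE)]


def torus_cost(u, v):
    """Minimum number of hops between u and v on the SIDE x SIDE torus."""
    dx, dy = abs(u[0] - v[0]), abs(u[1] - v[1])
    return min(dx, SIDE - dx) + min(dy, SIDE - dy)


lo, hi = growth_ratios(nodes, torus_cost, radii=[1, 2])
print(lo, hi)
```

On this torus a ball of radius r (for r well below SIDE / 2) contains 2r² + 2r + 1 nodes, so the two ratios printed are 13/5 for r = 1 and 41/13 for r = 2.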

Local  Memory.  We  partition  the  local  memory  of  each  node  u  into  two  parts.  The  first 
part,  the  main  memory,  stores  objects.  The  second  part,  the  auxiliary  memory,  is  for  storing 
possible  control  information. 
Figure 2.1: Properties of the cost function.

Local  Computation.  There  is  no  cost  associated  with  local  computation.  (Although 
the  model  allows  an  arbitrary  amount  of  local  computation  at  zero  cost,  our  algorithm  does 
not  perform  any  particularly  complex  local  operations.) 

Complexity measures. We evaluate any solution on the basis of four different complexity
measures. The first measure is the cost of reading an object. The second measure
is  the  size  of  the  auxiliary  memory  at  any  node.  The  remaining  two  measures  concern  the 
dynamic  nature  of  the  problem:  We  address  the  complexity  of  inserting  or  deleting  a  copy 
of  an  object,  and  of  adding  or  removing  a  network  node.  The  third  measure  is  the  cost 
of  inserting  or  deleting  a  copy  of  an  object.  The  fourth  measure  is  adaptability,  which  is 
defined  as  the  number  of  nodes  whose  auxiliary  memory  is  updated  upon  the  addition  or 
removal of a node. (Our notion of adaptability is analogous to that of [11].)


2.3  The  access  scheme 

In  this  section,  we  present  our  access  scheme  for  shared  objects.  We  assume  that  n  is  a 
power of 2^b, where b is a fixed positive integer to be specified later (see the beginning of
Section  2.5).  For  each  node  x  in  V,  we  assign  a  label  independently  and  uniformly  at 
random from [n]. For i in [log n], let x^i denote the ith bit of the label of x. Note that the
label of a node x is independent of the unique (log n)-bit identification of the node. For all
x in V (resp., A in 𝒜), we define x[i] (resp., A[i]) as the nonnegative integer with binary
representation x^{(i+1)b−1} ... x^{ib} (resp., A^{(i+1)b−1} ... A^{ib}), for i in [(log n)/b]. We also assign a
total order to the nodes in V, given by the bijection β : V → [n].
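
The labeling and the b-bit digits x[i] can be sketched in Python as follows. The function names and the convention that bit 0 is the least significant bit of the label are assumptions of this illustration; the latter is our reading of the binary representation given above.

```python
import random


def assign_labels(nodes, n, seed=None):
    """Give each node a label drawn independently and uniformly from [n].

    Labels are not required to be distinct.
    """
    rng = random.Random(seed)
    return {x: rng.randrange(n) for x in nodes}


def digit(label, i, b):
    """x[i]: the integer whose binary representation is bits
    (i+1)b-1 down to ib of `label` (bit 0 = least significant, by assumption)."""
    return (label >> (i * b)) & ((1 << b) - 1)


# With b = 2, the 6-bit label 110110 splits into the digits 11 | 01 | 10.
label = 0b110110
print([digit(label, i, 2) for i in range(3)])  # [2, 1, 3]
```

With b = 1 the digits are simply the individual bits of the label, which is the case illustrated in Figure 2.2.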

We  partition  the  auxiliary  memory  of  each  node  in  two  parts,  namely  the  neighbor  table 
and  the  pointer  list  of  the  node. 

• Neighbor table. For each node x, the neighbor table of x consists of (log n)/b levels.
The ith level of the table, i in [(log n)/b], consists of primary, secondary, and reverse
(i, j)-neighbors, for all j in [2^b]. The primary (i, j)-neighbor y of x is such that
y[k] = x[k] for all k in [i], and either (i) i < (log n)/b − 1 and y is the node of
minimum c(x, y) such that y[i] = j, if such a node exists, or (ii) y is the node
with largest β(y) among all nodes z such that z[i] matches j in the largest number
of rightmost bits. Note that the primary (i, j)-neighbor of a node x is guaranteed
to exist, since x itself is a candidate node. Let d be a fixed positive integer, to be
specified later (see the beginning of Section 2.5). Let y be the primary (i, j)-neighbor
of x. If y[i] = j, then let W_{i,j} denote the set of nodes w in V \ {y} such that
w[k] = x[k] for k in [i], w[i] = j, and c(x, w) is at most O(c(x, y)). Otherwise,
let W_{i,j} be the empty set. The set of secondary (i, j)-neighbors of x is the subset U
of min{d, |W_{i,j}|} nodes u with minimum c(x, u) in W_{i,j}; that is, c(x, u) is at most
c(x, w) for all w in W_{i,j} \ U and all u in U, and |U| ≤ d. A node w is a reverse
(i, j)-neighbor of x if and only if x is a primary (i, j)-neighbor of w.

In Figure 2.2, we illustrate the primary neighbor entries in the neighbor table of
node x for b = 1; suppose the level-i neighbors of x in the table are given by (i)
above.

•  Pointer list. Each node x also maintains a pointer list Ptr(x) with pointers to copies
of some objects in the network. Formally, Ptr(x) is a set of triples (A, y, k), where A
is in 𝒜, y is a node that holds a copy of A, and k is an upper bound on the cost c(x,y).
We maintain the invariant that there is at most one triple associated with any object
in Ptr(x). The pointer list of x may only be updated as a result of insert and delete
operations. All the pointer lists can be initialized by inserting each shared copy in
the network at the start of the computation. We do not address the cost of initializing
the auxiliary memories of the nodes.

Figure 2.2: The primary neighbor table of node x, for b = 1.
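The definition of the primary (i,j)-neighbor can be made concrete with a brute-force sketch over a small node set. This is only an illustration under stated assumptions: labels, costs, and the order β are passed in as plain Python objects, bit 0 is taken to be the rightmost bit, ties in cost are broken arbitrarily, and all function and parameter names (`primary_neighbor`, `low_bits_agreeing`, `levels`) are hypothetical.

```python
def digit(label: int, i: int, b: int) -> int:
    """x[i]: the b-bit block of `label` at block position i (bit 0 = rightmost)."""
    return (label >> (i * b)) & ((1 << b) - 1)


def low_bits_agreeing(a: int, c: int, b: int) -> int:
    """Number of consecutive rightmost bits (at most b) on which a and c agree."""
    count = 0
    while count < b and ((a >> count) & 1) == ((c >> count) & 1):
        count += 1
    return count


def primary_neighbor(x, i, j, labels, cost, beta, b, levels):
    """Brute-force primary (i, j)-neighbor of x, following cases (i) and (ii).

    labels: dict node -> integer label; cost(u, v): cost function;
    beta: dict node -> rank in the total order; levels = (log n)/b.
    """
    # Candidates agree with x on blocks 0 .. i-1; x itself always qualifies.
    cands = [y for y in labels
             if all(digit(labels[y], k, b) == digit(labels[x], k, b)
                    for k in range(i))]
    exact = [y for y in cands if digit(labels[y], i, b) == j]
    if i < levels - 1 and exact:
        # Case (i): nearest candidate whose ith block equals j.
        return min(exact, key=lambda y: cost(x, y))
    # Case (ii): best partial match of block i against j, ties broken by beta.
    agree = {y: low_bits_agreeing(digit(labels[y], i, b), j, b) for y in cands}
    best = max(agree.values())
    return max((y for y in cands if agree[y] == best), key=lambda y: beta[y])
```

A real node would store these neighbors in its table rather than recompute them; the scan over all nodes here is only meant to mirror the definition.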

Let r be the node whose label matches (in terms of binary representation) the identification
of A in the largest number of rightmost bits. (In case of a tie between several nodes
r_0, ..., r_k, let r be the unique node maximizing β(r_i).) We call r the root node for object
A. The uniqueness of the root node for each A in 𝒜 is crucial to guarantee the success of
every read operation.
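The choice of the root node can be sketched as a single maximization. The sketch assumes that labels and object identifications are integers of the same bit length, that bit 0 is the rightmost bit, and that β is given as a dictionary; the names `matching_rightmost_bits` and `root_node` are hypothetical.

```python
def matching_rightmost_bits(a: int, c: int, n_bits: int) -> int:
    """Number of consecutive rightmost bits (at most n_bits) on which a and c agree."""
    count = 0
    while count < n_bits and ((a >> count) & 1) == ((c >> count) & 1):
        count += 1
    return count


def root_node(object_id: int, labels: dict, beta: dict, n_bits: int):
    """Node whose label agrees with object_id on the most rightmost bits.

    Ties are broken in favor of the node with the largest beta value, so the
    result is unique whenever beta is a bijection.
    """
    return max(
        labels,
        key=lambda v: (matching_rightmost_bits(labels[v], object_id, n_bits), beta[v]),
    )
```

Because the key is a pair (match length, β), two nodes with identical labels are still ordered, which is what makes the root unique.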

In this section and throughout the paper, we use the notation ⟨a⟩_k to denote the sequence
a_0, ..., a_k (of length k+1). When clear from the context, k will be omitted. In particular, a
primary neighbor sequence for A is a maximal sequence ⟨u⟩_k such that u_0 is in V, u_k is the
root node for A, and u_{i+1} is the primary (i, A[i])-neighbor of u_i, for all i. It is worth noting
that the sequence ⟨u⟩_k is such that the label of node u_i satisfies (u_i[i-1], ..., u_i[0]) =
(A[i-1], ..., A[0]), for all i.

We now give an overview of the read, insert, and delete operations.

Figure 2.3: A read request for object A is forwarded along the primary neighbor sequence
for A with x = x_0.

•  Read. Consider a node x attempting to read an object A. The read operation proceeds
by successively forwarding the read request for object A originating at node x along
the primary neighbor sequence ⟨x⟩ for A with x_0 = x. When forwarding the read
request, node x_{i-1} informs x_i of the current best upper bound k on the cost of sending
a copy of A to x. On receiving the read request with associated upper bound k, node
x_i proceeds as follows. If x_i is the root node for A, then x_i requests that the copy
of A associated with the current best upper bound k be sent to x. Otherwise, x_i
communicates with its primary and secondary (i, A[i])-neighbors to check whether
the pointer list of any of these neighbors has an entry (A, z, k_1) such that k_1 is at
most k. Then, x_i updates k to be the minimum of k and the smallest value of k_1 thus
obtained (if any). If k is within a constant factor of the cost of following ⟨x⟩ up to
x_i, that is, k is O(Σ_{j=0}^{i-1} c(x_j, x_{j+1})), then x_i requests that the copy of A
associated with the upper bound k be sent to x. Otherwise, x_i forwards the read
request to x_{i+1}. Figure 2.3 illustrates an example of a read request for object A
generated by node x = x_0, which is forwarded along ⟨x⟩ until a pointer to a copy of
A is found at node y, one of the secondary neighbors of node x_4.

Relating to the more general description of the read operation of our scheme
described in Section 2.1, the tree T associated with object A is given by the following
rule: The parent of node x in T is the primary (i, A[i])-neighbor of x, where i is
the maximum index such that (x[i-1], ..., x[0]) = (A[i-1], ..., A[0]); or in other
words, the parent of node x is the node x_1 in the primary neighbor sequence ⟨x⟩ for
A with x = x_0. Figure 2.4 illustrates the tree T and the sequence ⟨x⟩ (without loss
of generality, suppose all the x_i's in this sequence are distinct).

Figure 2.4: The tree T associated with object A and the primary neighbor sequence for
x = x_0. (The root of T is the root for A.)

•  Insert. An insert request for object A generated by node y updates the pointer lists
of the nodes in some prefix of the primary neighbor sequence ⟨y⟩ for A with y_0 = y.
When such an update arrives at a node y_i by means of an insert message, y_i updates
its pointer list if the upper bound Σ_{j=0}^{i-1} c(y_j, y_{j+1}) on the cost of getting object A
from y is smaller than the current upper bound associated with A in this list. In other
words, y_i updates Ptr(y_i) if (A, ·, ·) is not in this list, or if (A, ·, k) is in Ptr(y_i) and
k is greater than Σ_{j=0}^{i-1} c(y_j, y_{j+1}). Node y_i forwards the insert request to node y_{i+1}
only if Ptr(y_i) is updated.

•  Delete. A delete request for object A generated by node y eventually removes all
triples of the form (A, y, ·) from the pointer lists Ptr(y_i), where ⟨y⟩ is the primary
neighbor sequence for A with y_0 = y, making the copy of A at y unavailable to other
nodes in the network. Upon receiving such a request by means of a delete message,
node y_i checks whether the entry associated with A in its pointer list is of the form
(A, y, ·). If it is not, the delete procedure is completed and we need not proceed
further in updating the pointer lists in ⟨y⟩. Otherwise, y_i deletes this entry from its
pointer list, and checks for entries associated with A in the pointer lists of its reverse
(i-1, A[i-1])-neighbors. If an entry is found, y_i updates Ptr(y_i) by adding the
entry (A, w, k + c(w, y_i)), where w is the reverse (i-1, A[i-1])-neighbor of y_i with
minimum upper bound k associated with A in its pointer list. A delete message is
then forwarded to y_{i+1}.
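The insert rule described above can be sketched as a single loop over a primary neighbor sequence. The sketch assumes the sequence y_0, ..., y_k has already been computed and is passed in as a list, and it models each pointer list as a dictionary mapping an object to a (holder, bound) pair; the names `insert_copy` and `ptr` are hypothetical, and message passing is replaced by direct updates.

```python
def insert_copy(obj, seq, cost, ptr):
    """Propagate an insert of `obj` held at seq[0] along the sequence `seq`.

    seq:  primary neighbor sequence y_0, ..., y_k for obj (assumed given).
    cost: function cost(u, v).
    ptr:  dict node -> dict obj -> (holder, upper_bound); updated in place.
    Returns the list of nodes whose pointer list was updated.
    """
    holder = seq[0]
    bound = 0  # running sum of c(y_j, y_{j+1}) along the prefix followed so far
    updated = []
    for idx, node in enumerate(seq):
        entry = ptr.setdefault(node, {}).get(obj)
        if entry is not None and entry[1] <= bound:
            break  # existing pointer is at least as good: stop forwarding
        ptr[node][obj] = (holder, bound)  # at most one entry per object
        updated.append(node)
        if idx + 1 < len(seq):
            bound += cost(node, seq[idx + 1])
    return updated
```

Storing one entry per object in a dictionary keeps the invariant that a pointer list holds at most one triple for any object, and the early `break` reflects the rule that a node forwards the insert only if its own list changed.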

The read, insert, and delete operations are summarized in Figures 2.5 and 2.6. The
messages and requests in the figures are all with respect to object A. A read request is
generated by node x when x (= x_0) sends a message Read(x, ∞, ·) to itself, if x does not
hold a copy of A. A read message Read(x, k, y) indicates that (i) a read request for object
A was generated at node x, (ii) the current best upper bound on the cost of reading a copy
of A is k, and (iii) such a copy resides at y. An insert (resp., delete) request is generated
when node y (= y_0) sends a message Insert(y, 0) (resp., Delete(y)) to itself. An insert
message Insert(y, k) indicates to its recipient node z that the best known upper bound on
the cost incurred by bringing the copy of A located at y to z is k. We assume that
y holds a copy of A and that this copy is unshared (resp., shared) when an insert (resp.,
delete) request for A is generated at y.

The  correctness  of  our  access  scheme  follows  from  the  two  points  below: 

1.  The insert and delete procedures maintain the following invariants. For any A in 𝒜
and any y in V, there is at most one entry associated with A in the pointer list of y.
If y holds a shared copy of A and ⟨y⟩ is the primary neighbor sequence for A with
y_0 = y, then (i) there is an entry associated with A in the pointer list of every node
in ⟨y⟩, and (ii) the nodes that have a pointer list entry associated with the copy of A
at y form a prefix subsequence of ⟨y⟩. The preceding claims follow directly from the
insert and delete procedures as described.

2.  Every read request for any object A by any node x is successful; that is, it locates
and brings to x a shared copy of A, if such a copy is currently available. The read
operation proceeds by following the primary neighbor sequence ⟨x⟩ for A with x_0 =
x, until either a copy of A is located or the root for A is reached. By point 1 above,
there exists a shared copy of A in the network if and only if the root for A has a
pointer to it.


Action of x_i on receiving a message Read(x, k, y):

If i > 0 and x_i[i-1] ≠ A[i-1], or i = (log n)/b - 1 (that is, x_i is the root for A) then

•  Node x_i sends a message Satisfy(x) to node v such that (A, v, ·) is in Ptr(x_i),
requesting it to send a copy of A to x. If Ptr(x_i) has no such entry, then there
are no shared copies of A.

Otherwise

•  Let U be the set of secondary (i, A[i])-neighbors of x_i. Node x_i requests a copy
of A with associated upper bound at most k from each node in U ∪ {x_{i+1}}.

•  Each node u in U ∪ {x_{i+1}} responds to the request message received from x_i as
follows: if there exists an entry (A, v, q_v) in Ptr(u) and if k_v = q_v + c(x_i, u) +
Σ_{j=0}^{i-1} c(x_j, x_{j+1}) is at most k, then u sends a success message Success(v, k_v) to
x_i.

•  Let U' be the set of nodes u from which x_i receives a response message
Success(u, k_u). If U' is not empty, then x_i updates (k, y) to be (k_z, z), where
z is a node with minimum k_z over all nodes in U'.

•  If k = O(Σ_{j=0}^{i-1} c(x_j, x_{j+1})) then x_i sends a message Satisfy(x) to node y,
requesting y to send a copy of A to x. Otherwise, x_i forwards a message
Read(x, k, y) to x_{i+1}.

Figure 2.5: Action on receiving a message Read for object A.


2.4  Performance  bounds 

In  this  section,  we  state  our  main  claims  regarding  the  performance  of  our  access  scheme. 
In  Theorems  1  through  4  below,  we  state  bounds  on  the  cost  of  a  read,  the  cost  of  an  insert 
or delete, the size of auxiliary memory, and the adaptability of our access scheme.

Theorem 1 Let x be any node in V and let A be any object in 𝒜. If y is the node with
minimum c(x, y) that holds a shared copy of A, then the expected cost of satisfying a read
request for A by x is O(f(ℓ(A)) c(x, y)).

Let C denote max{c(u, v) : u, v ∈ V}. If a node x tries to read an object A for which
there is currently no shared copy in the network, then the expected cost of the read operation
is O(C).


Theorem 2 The expected cost of an insert operation is O(C), and that of a delete operation
is O(C log n).


Theorem 3 Let q be the number of objects that can be stored in the main memory of each
node. The size of the auxiliary memory at each node is O(q log^2 n) words with high
probability.


Theorem 4 The adaptability of our scheme is O(log n) expected and O(log^2 n) with high
probability.


Action of y_i on receiving a message Insert(y, k):

If (A, ·, ·) is not in Ptr(y_i), or (A, ·, k') is in Ptr(y_i) and k' > k, then

•  Node y_i accordingly creates or replaces the entry associated with A in Ptr(y_i) by
inserting (A, y, k) into this list.

•  If y_i[i-1] = A[i-1] then y_i sends a message Insert(y, k + c(y_i, y_{i+1})) to
y_{i+1}.


Action of y_i on receiving a message Delete(y):

If (A, y, ·) is in Ptr(y_i) then

•  Let U be the set of reverse (i-1, A[i-1])-neighbors of y_i. Node y_i removes
(A, y, ·) from Ptr(y_i), and requests a copy of A from each u in U.

•  Each u in U responds to the request message from y_i by sending a message
Success(v, q_v + c(y_i, u)) to y_i if and only if (A, v, q_v) is in Ptr(u).

•  Let U' be the set of nodes u such that y_i receives a message Success(u, k_u) in
response to the request message it sent. If |U'| > 0 then y_i inserts (A, w, k_w)
into Ptr(y_i), where w is the node in U' such that k_w ≤ k_u, for all u in U'.

•  If y_i[i-1] = A[i-1] then y_i sends a message Delete(y) to y_{i+1}.


Figure 2.6: Actions on receiving messages Insert and Delete for object A.

2.5  Analysis

In  this  section,  we  analyze  the  access  scheme  described  in  Section  2.3,  and  establish  the 
main  results  described  in  Section  2.4.  Section  2.5.1  presents  some  useful  properties  of 
balls.  Section  2.5.2  presents  properties  of  primary  and  secondary  neighbors.  Section  2.5.3 
presents  the  proofs  of  Theorems  1  and  2.  Sections  2.5.4  and  2.5.5  present  the  proofs  of 
Theorems  3  and  4,  respectively. 

Several constants appear in the model, the algorithms, and the analysis: δ and Δ appear
in the model, b and d appear in the algorithms, and γ and ε appear in the analysis. We choose
b, d, γ, and ε such that the following inequalities hold.

2'’  >  A^7^  (2.2) 

d>-f^  (2.3) 

7  >  A^  (2.4) 

£  >  max{6A/7,4e-''/^^,6((/+  l)/2^6(2e/2^)'^/^6(eA7V^/)'^/2|  (2.5) 

£<  (10  21^-1  (2.6) 


An  assignment  of  values  to  the  constants  7,  b,  d,  and  e  that  satisfies  the  above  inequalities 
may  be  obtained  as  follows:  Set7  equal  to  2^'^^/A^/^,  d  equal  to  62^^/^+ and  £  equal 
to  6eA®/^/2*’/^.  The  preceding  assignment  satisfies  Equations  2.5  if  b  is  set  sufficiently 
large.  Equations  2.4  and  2.6  can  be  satisfied  by  setting  b  large  enough  so  that  2*"  >  A*  and 

26/3-ri)iog^2‘|  ^  60eA®/^. 


2.5.1  Properties  of  balls 

In  this  section,  we  prove  several  properties  of  the  “local  neighborhoods”  of  the  nodes  in 
V  with  respect  to  the  cost  function  c.  We  view  these  “neighborhoods”  as  balls  centered  at 
the nodes of V. In Section 2.2, we defined the ball of radius r around a node u, M(u, r).
Now we define the ball of size k around node u, N(u, k), for any u in V and any integer
k in [1, n]: Let N(u, k) denote the unique set of k nodes such that for any v in N(u, k)
and w not in N(u, k), c(u, v) is less than c(u, w). For convenience, if k is greater than n,
we let N(u, k) be V. As for the balls M(u, r), we define the radius of N(u, k) to be the
maximum value of c(u, v) over all v in N(u, k).

Figure 2.7: Illustrating the proof of Lemma 2.5.1.
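The two kinds of balls can be sketched directly from their definitions. The sketch assumes that M(u, r), which is defined in Section 2.2, is the set of nodes within cost r of u, and that the costs from u are distinct so that N(u, k) is unique; the function names are hypothetical.

```python
def ball_by_radius(u, r, nodes, cost):
    """M(u, r): all nodes within cost r of u (assumed reading of Section 2.2)."""
    return {v for v in nodes if cost(u, v) <= r}


def ball_by_size(u, k, nodes, cost):
    """N(u, k): the k nodes closest to u, or all of V if k exceeds n.

    Assumes distinct costs from u; with ties, the choice among tied nodes
    is arbitrary here.
    """
    ordered = sorted(nodes, key=lambda v: cost(u, v))
    return set(ordered[:k])


def radius(u, ball, cost):
    """Radius of a ball around u: the maximum cost from u to any of its nodes."""
    return max(cost(u, v) for v in ball)
```

Under these assumptions, a ball of size k is the same set as the ball of radius equal to its own radius, which is how the lemmas move between the two views.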

In  the  proofs  of  the  lemmas  in  this  section,  we  extensively  use  Equation  2.1  as  well  as 
the  fact  that  the  cost  metric  is  symmetric  and  satisfies  the  triangle  inequality. 

Lemma 2.5.1 Let u, v, and w be in V and let k_0 and k_1 be positive integers. If v is in
N(u, k_0) and w is in N(v, k_1), then w is in N(u, Δk_0 + Δ^2 k_1).

Proof: Let r_0 and r_1 denote c(u, v) and c(v, w), respectively. The node w is contained in
the ball M(u, c(u, w)). If r_0 > r_1, then |M(u, c(u, w))| is at most |M(u, r_0 + r_1)|, which
is at most Δk_0 by Equation 2.1. Otherwise, |M(u, r_0 + r_1)| is at most |M(v, 3r_1)|, which
is at most Δ^2 k_1 by Equation 2.1. Therefore, w belongs to N(u, Δk_0 + Δ^2 k_1). Figure 2.7
illustrates the case r_0 < r_1. Q.E.D.

We now consider the smallest (resp., the largest) ball centered at a node u that contains
(resp., is contained in) some given subset of nodes. Given any subset S of V and some
node u in S, let n_⊆(u, S) (resp., n_⊇(u, S)) denote the largest (resp., smallest) integer k
such that N(u, k) is a subset (resp., superset) of S. Let N_⊆(u, S) and N_⊇(u, S) denote
N(u, n_⊆(u, S)) and N(u, n_⊇(u, S)), respectively.

Lemma 2.5.2 Let u be in V, let S be a subset of V, and let k be in [1, n]. Then N(u, k) is
a subset (resp., superset) of S if and only if N(u, k) is a subset of N_⊆(u, S) (resp., superset
of N_⊇(u, S)).

Proof: If N(u, k) is a subset of S then n_⊆(u, S) is at least k; hence, N(u, k) is a subset
of N_⊆(u, S). If N(u, k) is a subset of N_⊆(u, S) then N(u, k) is a subset of S because
N_⊆(u, S) is a subset of S. If N(u, k) is a superset of S then n_⊇(u, S) is at most k; hence,
N(u, k) is a superset of N_⊇(u, S). If N(u, k) is a superset of N_⊇(u, S) then N(u, k) is a
superset of S because N_⊇(u, S) is a superset of S. Q.E.D.

Lemma 2.5.3 Let u belong to V, and let k_0 and k_1 denote positive integers such that
k_1 ≥ Δ^2 k_0. For any v in N(u, k_0), n_⊆(v, N(u, k_1)) is at least k_1/Δ and N_⊇(v, N(u, k_1))
is a subset of N(u, Δk_1).

Proof: We first obtain a lower bound on n_⊆(v, N(u, k_1)). Let r_0 and r_1 denote the radii of
N(u, k_0) and N(u, k_1), respectively. Since k_1 ≥ Δ^2 k_0, Equation 2.1 implies that r_1 - r_0 ≥
(r_1 + r_0)/2. Let w be the node in N_⊆(v, N(u, k_1)) such that c(v, w) is maximum. A ball of
radius (r_1 - r_0) around v is contained in N(u, k_1) (since v is contained in N(u, k_0)). Thus
r_1 - r_0 ≤ c(v, w). It follows that 2c(v, w) is at least 2(r_1 - r_0) ≥ r_1 + r_0 and M(v, 2c(v, w))
is a superset of N(u, k_1). We now obtain a lower bound on n_⊆(v, N(u, k_1)) as follows:

n_⊆(v, N(u, k_1)) = |M(v, c(v, w))|
                  ≥ |M(v, 2c(v, w))|/Δ
                  ≥ k_1/Δ.

We now place an upper bound on n_⊇(v, N(u, k_1)). Let w be the node in N_⊇(v, N(u, k_1))
such that c(v, w) is maximum. We have r_1 - r_0 ≤ c(v, w) ≤ r_1 + r_0. (We showed that
r_1 - r_0 ≤ c(v, w) in the preceding paragraph, and the other inequality follows from the
triangle inequality.) It follows that 2(r_1 - r_0) is at least c(v, w) and M(v, c(v, w)) is a
subset of M(u, 2r_1). Therefore,

n_⊇(v, N(u, k_1)) = |M(v, c(v, w))|
                  ≤ |M(u, 2r_1)|
                  ≤ Δk_1.

Q.E.D.

We use Lemmas 2.5.1 to 2.5.3 to prove Lemma 2.5.4 below. Lemma 2.5.4 and
Corollary 2.5.4.1 are used in Section 2.5.3. We refer to any predicate on V that depends only on
the label of v as a label predicate. Given any node u in V and a label predicate 𝒫 on V, let
p(u, 𝒫) denote the node v such that (i) 𝒫(v) holds, and (ii) for any node w such that 𝒫(w)
holds, c(u, v) is at most c(u, w). (We let p(u, 𝒫) be null if such a v is not defined.) Let
P(u, 𝒫) be M(u, c(u, p(u, 𝒫))), if p(u, 𝒫) is not null, and V otherwise.

Lemma 2.5.4 examines the effect of the relationship between the set P(u, 𝒫) and the
probability distribution of the labels of the nodes, for any given node u and label predicate
𝒫. For u in V, and i in [(log n)/b], let λ_{≥i}(u) denote the string of (log n - ib) bits given by
u[(log n)/b - 1] ... u[i+1]u[i]. For convenience, we let λ_{>i}(u) denote λ_{≥i+1}(u). For all i
and all u in V, let 𝒫_i(u) hold if and only if u[i] = A[i]. For all i and all u in V, let 𝒫_{<i}(u)
denote ∧_{j∈[i]} 𝒫_j(u). Let 𝒫_{≤i}(u), 𝒫_{>i}(u), and 𝒫_{≥i}(u) be defined similarly. We note that for
u and v in V, and nonnegative integers i and j, if (u ≠ v) ∨ ((u = v) ∧ (i ≠ j)), then 𝒫_i(u)
and 𝒫_j(v) are independent random variables. Also, each of the predicates defined above is
a label predicate.


Lemma 2.5.4 Let S and S' be subsets of V, and let u belong to S. Let 𝒫 be a label
predicate on V and for each v in S', let λ_{≥0}(v) be chosen independently and uniformly at
random.

1.  Given that P(u, 𝒫) ⊆ S, we have that (i) the variables λ_{≥0}(v), for all v in S' \
P(u, 𝒫), are independent and uniformly random, and (ii) for each node v in P(u, 𝒫) \
{p(u, 𝒫)}, 𝒫(v) is false.

2.  Given that P(u, 𝒫) ⊈ S, we have that (i) the variables λ_{≥0}(v) for all v in S' \
N_⊆(u, S) are independent and uniformly random, and (ii) for each node v in N_⊆(u, S),
𝒫(v) is false.

3.  Given that P(u, 𝒫) ⊇ S, we have that (i) the variables λ_{≥0}(v) for all v in S' \
N_⊇(u, S) are independent and uniformly random, and (ii) for each node v in N_⊇(u, S) \
{p(u, 𝒫)}, 𝒫(v) is false.

Proof: We first consider Part 1 of the lemma. Part 1(i) follows from the independence of
𝒫(v) and 𝒫(w), for any two distinct nodes v and w. By the definition of P, 𝒫(p(u, 𝒫))
holds and for each node v in P(u, 𝒫) \ {p(u, 𝒫)}, 𝒫(v) is false. This proves Part 1(ii).
Parts 2 and 3 follow similarly. For Part 2, we note that the event P(u, 𝒫) ⊈ S is equivalent
to the event that for each node v in N_⊆(u, S), 𝒫(v) is false. For Part 3, we note that the event
P(u, 𝒫) ⊇ S is equivalent to the event that for each node v in N_⊇(u, S) \ {n_⊇(u, S)}, 𝒫(v)
is false. Q.E.D.

The  following  claim  follows  from  repeated  application  of  Part  1  of  Lemma  2.5.4. 

Corollary 2.5.4.1 Let S be an arbitrary subset of V, let i be in [(log n)/b - 1], and let S'
be a subset of V such that λ_{≥0}(u) is independently and uniformly random for each u in
S'. Given a sequence of nodes u_0, ..., u_i such that for all j in [i], u_{j+1} = p(u_j, 𝒫_{≤j}) and
P(u_j, 𝒫_{≤j}) ⊆ S, we have

1.  The variables λ_{≥0}(u) for all u in S' \ ∪_{j∈[i]} P(u_j, 𝒫_{≤j}) are independent and uniformly
random.

2.  The  variable  A>i(u,)  is  uniformly  random  and  for  each  node  u  in  Uj^[i]P{uj,V<j)  \ 

P<iiu)  is  false.  Q.E.D. 


2.5.2  Properties  of  neighbors 

In this section, we establish certain claims concerning the different types of neighbors
that are defined in Section 2.3. We differentiate between root and nonroot primary
(i,j)-neighbors. A root primary (i,j)-neighbor w of v is a primary (i,j)-neighbor w of v such
that w[i] ≠ j or i = (log n)/b - 1. A primary neighbor that is not a root primary neighbor
is a nonroot primary neighbor. Note that, for i < (log n)/b - 1, if u is a root primary
(i,j)-neighbor of v, then u[ℓ] equals v[ℓ], for each ℓ in [i], and there is no node w in V such
that w[i] equals j and w[ℓ] equals v[ℓ] for each ℓ in [i].

Lemma 2.5.5 Let u and v be in V, and let k denote |M(u, c(u, v))|. For any j in [2^b], we
have that (i) for any i in [(log n)/b - 1], the probability that u is a primary (i,j)-neighbor
of v is at most e^{-((k/Δ)-2)/2^{(i+1)b}}, and (ii) for any i in [(log n)/b], the probability that u is a
root primary (i,j)-neighbor of v is at most (1/2^{ib}) e^{-n/2^{(i+1)b}}.

Proof: Consider the ball M(v, c(v, u)). By Equation 2.1, |M(v, c(v, u))| ≥
|M(v, 2c(v, u))|/Δ. Since M(v, 2c(v, u)) is a superset of M(u, c(u, v)), we have
|M(v, c(v, u))| ≥ k/Δ. The probability that a node w in M(v, c(u, v)) \ {v, u} does
not match the label of u in its (i+1)b rightmost bits is at most 1 - 1/2^{(i+1)b}. Since i is less
than (log n)/b - 1, the probability that u is a primary (i,j)-neighbor of v is at most

(1 - 1/2^{(i+1)b})^{(k/Δ)-2}
  ≤ e^{-((k/Δ)-2)/2^{(i+1)b}}.

If u is a root primary (i,j)-neighbor of v, then u[ℓ] equals v[ℓ] for each ℓ in [i] and there is
no node w in V such that w[i] equals j and w[ℓ] equals v[ℓ] for each ℓ in [i]. Therefore, the
probability that u is a root primary (i,j)-neighbor of v is at most

(1/2^{ib})(1 - 1/2^{(i+1)b})^{n-1}(1 - 1/2^b)
  ≤ (1/2^{ib})(1 - 1/2^{(i+1)b})^n
  ≤ (1/2^{ib}) e^{-n/2^{(i+1)b}}.

Q.E.D.
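Both exponential bounds in the proof of Lemma 2.5.5 rest on the same elementary inequality, which is spelled out below for reference. The symbols m and t are placeholders introduced only for this note; the instantiation assumes the exponents read as 2^{(i+1)b}, (k/Δ) - 2, and n.

```latex
% For any real m >= 1 and t >= 0, using 1 + x <= e^x with x = -1/m:
\[
  1 - \frac{1}{m} \le e^{-1/m}
  \quad\Longrightarrow\quad
  \Bigl(1 - \frac{1}{m}\Bigr)^{t} \le e^{-t/m}.
\]
% Taking m = 2^{(i+1)b} with t = (k/\Delta) - 2, and then with t = n, gives
\[
  \Bigl(1 - \frac{1}{2^{(i+1)b}}\Bigr)^{(k/\Delta)-2} \le e^{-((k/\Delta)-2)/2^{(i+1)b}},
  \qquad
  \Bigl(1 - \frac{1}{2^{(i+1)b}}\Bigr)^{n} \le e^{-n/2^{(i+1)b}}.
\]
```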

Corollary 2.5.5.1 Let u and v be in V, let i be in [(log n)/b], and let j be in [2^b]. If u is a
primary (i,j)-neighbor of v, then v is in N(u, O(2^{ib} log n)) with high probability.

Q.E.D.

Lemma 2.5.6 and Corollary 2.5.6.1 below establish bounds on the number of nodes v
such that u is a primary or secondary neighbor of v, and on the number of nodes v such that
v is a reverse neighbor of u, respectively. For any u in V, let a_u denote the total number of
triples (i, j, v) such that i belongs to [(log n)/b], j belongs to [2^b], v belongs to V, and u is a
primary or secondary (i,j)-neighbor of v. Lemma 2.5.6 is used in the proof of Theorem 4,
while Corollary 2.5.6.1 is used in the proofs of Theorems 2 and 3.

Lemma 2.5.6 Let u be in V and let i be in [(log n)/b]. Then the number of nodes for which
u is an ith level primary neighbor is O(log n) with high probability. Also, E[a_u] = O(log n)
and a_u is O(log^2 n) with high probability.

Proof: Given a node v in V, i in [(log n)/b - 1], and j in [2^b], it follows from Lemma 2.5.5
that the probability that u is a root primary (i,j)-neighbor of v is at most (1/2^{ib}) e^{-n/2^{(i+1)b}}.
Given a node v in V and j in [2^b], the probability that u is a root ((log n)/b, j)-primary
neighbor of v is at most 1/n.

Fix j in [2^b]. Let ℓ equal (log n - log log n)/b - Ω(1), where the constant in the Ω(1)
term is chosen sufficiently large. We consider two cases: i < ℓ and i ≥ ℓ. If i < ℓ, then
the probability that there exists v in V such that u is a root primary (i,j)-neighbor of v is
at most

n (1/2^{ib}) e^{-n/2^{(i+1)b}}
  = O(1/poly(n)).

If i ≥ ℓ, then given v in V, the probability that u is a root primary (i,j)-neighbor of v
is at most 1/2^{ib} = O((log n)/n). It follows from Chernoff bounds [10] that the number of
nodes for which u is a root primary (i,j)-neighbor is O(log n) with high probability.

We now consider secondary and nonroot primary neighbors. For any i in [(log n)/b], u
is a secondary or nonroot primary (i,j)-neighbor of v only if j is u[i] and u is one of the
d + 1 nodes w in V with minimum c(v, w) whose lowest ib bits match those of v. We now
fix u and i, set j to u[i], and obtain an upper bound on the probability that u is one of the at
most d + 1 nodes w with minimum c(v, w) and whose first ib bits match those of v.

Consider a node v in N(u, μ(k+1)2^{ib}) \ N(u, μk2^{ib}), where μ is a real constant to
be specified later. If k equals zero, then the probability that u is a primary or secondary
(i,j)-neighbor of v is at most 1/2^{ib}. Otherwise, consider the ball M(v, c(v, u)). By the
low-expansion condition (i.e., the right inequality of Equation 2.1), |M(v, c(v, u))| is at
least |M(v, 2c(v, u))|/Δ. We are given that M(u, c(u, v)) is a superset of N(u, μk2^{ib}).
Since M(v, 2c(v, u)) is a superset of M(u, c(u, v)), we obtain that |M(v, c(v, u))| is at
least μk2^{ib}/Δ. The probability that u is a primary or secondary (i,j)-neighbor of v is
at most

<  Ad{eii’^l{Ad)Y{e-^"l^l‘r^) 

<  l/((2^)''2**’). 

(The  second  step  holds  since  d  <  2^  <  2**’  and  (1  —  1/2*^)“^'*^  is  at  most  4.  The  third 
step  follows  by  choosing  fi  large  enough  with  respect  to  A  and  d  such  that  > 

(2^AVt/'^"^)(//A)‘^+i  for  all  A;  >  1.) 

Thus, the expected number of nodes for which u is a secondary or nonroot primary
neighbor is at most

Σ_{i ∈ [(log n)/b], j = u[i]}  Σ_{k ≥ 0}  Σ_{v ∈ N(u, μ(k+1)2^{ib}) \ N(u, μk2^{ib})}  1/((2μ)^k 2^{ib})
  ≤ Σ_{i ∈ [(log n)/b], j = u[i]}  2μ
  = O(log n).

To obtain a high probability bound on the number of nodes for which u is a secondary or
nonroot primary neighbor, we proceed as follows. For any v not in N(u, O(2^{(i+1)b} log n)),
it follows from Lemma 2.5.5 that the probability that u is a secondary or nonroot primary
(i,j)-neighbor of v is O(1/poly(n)). For any v in N(u, O(2^{(i+1)b} log n)), the probability
that u is a secondary or nonroot primary (i,j)-neighbor of v is at most 1/2^{ib}. Therefore,
the number of nodes for which u is a secondary or nonroot primary neighbor is O(log^2 n)
with high probability.

The bounds on expectation and the high probability bounds together establish that E[a_u]
is O(log n) and a_u is O(log^2 n) with high probability. Q.E.D.

Corollary 2.5.6.1 For any u in V, the total number of reverse neighbors of u is O(log^2 n)
with high probability, and O(log n) expected.

Proof: The desired claim follows directly from Lemma 2.5.6, since v is a reverse
neighbor of u only if u is a primary (i,j)-neighbor of v. Q.E.D.

Lemma 2.5.7 is used in the proof of Theorem 3. For a given node u, it provides a
bound on the number of primary neighbor sequences that have u in the ith position. For
any u and v in V and i in [(log n)/b], v is said to be an i-leaf of u if there exists a sequence
v = v_0, v_1, ..., v_{i-1}, v_i = u, such that for all j in [i], v_{j+1} is a primary (j, v_{j+1}[j])-neighbor
of v_j.

Lemma 2.5.7 Let u belong to V, and let i be in [(log n)/b]. Then the number of i-leaves
of u is O(2^{ib} log n) with high probability.

Proof: We establish the lemma by showing that if v is an i-leaf of u, then v is in
N(u, c_0 2^{ib} log n) with high probability, where c_0 is a real constant to be specified shortly
(see the next paragraph). By Corollary 2.5.5.1, we have that for all j in [i], v_j is in
N(v_{j+1}, c_1 2^{jb} log n) with high probability for some sufficiently large real constant c_1.
We will prove by induction on j in [i + 1] that v = v_0 is in N(v_j, c_0 2^{jb} log n) with high
probability.

The induction base follows trivially. For the induction step, assume that v is in
N(v_j, c_0 2^{jb} log n). By Corollary 2.5.5.1, v_j belongs to N(v_{j+1}, c_1 2^{jb} log n) with high
probability. Applying Lemma 2.5.1 with the substitution (v_{j+1}, v_j, v) for (u, v, w), we obtain
that v is in N(v_{j+1}, (Δ c_1 + Δ^2 c_0) 2^{jb} log n), with high probability. Since Δ^2 < 2^b (by
Equation 2.2), we can choose c_0 large enough so that c_0 (2^b − Δ^2) is at least Δ c_1. It thus
follows that v is in N(v_{j+1}, c_0 2^{(j+1)b} log n).

Applying the above inductive claim with j = i, we obtain that v is in N(u, c_0 2^{ib} log n)
with high probability, completing the proof. Q.E.D.

2.5.3  Cost  of  Operations 

In this section, we place upper bounds on the cost of the read, insert, and delete
operations by establishing Theorems 1 and 2. We first introduce some notation and prove a
few elementary lemmas in Section 2.5.3.1. The bulk of the analysis is in Sections 2.5.3.2
and 2.5.3.3. Using the tools developed in these two sections, we finally prove Theorems 1
and 2 in Section 2.5.3.4. Before beginning the analysis, we remark that most of the notation
and tools developed in Sections 2.5.3.1, 2.5.3.2, and 2.5.3.3 are only used in the analysis
of the read operation.

2.5.3.1  Preliminaries 

Consider a read request originating at node x for an object A. Let y denote a node that
has a copy of A. In the following, we show that the expected cost of a read operation
is O(f(ℓ(A)) c(x, y)). Letting y denote the node with minimum c(x, y) among the set of
nodes that have a copy of A, this bound implies that the expected cost is asymptotically
optimal.

Let ⟨x⟩ and ⟨y⟩ be the primary neighbor sequences for A with x_0 = x and y_0 = y,
respectively. For any nonnegative integer i, let A_i (resp., D_i) denote the ball of smallest
radius around x_i (resp., y_i) that contains x_{i+1} (resp., y_{i+1}). Let B_i (resp., E_i) denote the
set ∪_{0≤j≤i} A_j (resp., ∪_{0≤j≤i} D_j). Let C_i denote the ball of smallest radius around x_i that
contains all of the secondary (i, A[i])-neighbors of x_i. For convenience, we define
B_{-1} = E_{-1} = ∅.

It is useful to consider an alternative view of x_i, y_i, A_i, and D_i. For any nonnegative i,
if x_{i+1} (resp., y_{i+1}) is not the root node for A, then x_{i+1} (resp., y_{i+1}) is p(x_i, 𝒫_{≤i}) (resp.,
p(y_i, 𝒫_{≤i})) and A_i (resp., D_i) is P(x_i, 𝒫_{≤i}) (resp., P(y_i, 𝒫_{≤i})).

Let γ be an integer constant satisfying Equations 2.2 through 2.6. For any nonnegative
integer i and any integer j, let X_i^j (resp., Y_i^j) denote the ball N(x, γ^j 2^{(i+1)b}) (resp.,
N(y, γ^j 2^{(i+1)b})). Let i* denote the least integer such that the radius of X_{i*}^1 is at least c(x, y).
Let a_i (resp., b_i) denote the radius of X_i^1 (resp., Y_i^1).

Lemma 2.5.8 For all i such that i ≥ i*, X_i^2 is a superset of Y_i^1.

Proof: By the definition of i*, a_i is at least c(x, y). Therefore, M(y, 2a_i) is a superset of
X_i^1. Hence, M(y, 2a_i) contains at least γ 2^{(i+1)b} nodes and is a superset of Y_i^1.
By Equation 2.1, |M(x, 3a_i)| is at most Δ^2 |M(x, a_i)| ≤ Δ^2 |X_i^1|. By Equation 2.4,
Δ^2 |X_i^1| ≤ Δ^2 γ 2^{(i+1)b} ≤ γ^2 2^{(i+1)b}. Thus, M(x, 3a_i) is a subset of X_i^2. Since M(x, 3a_i) is
a superset of M(y, 2a_i), which is a superset of Y_i^1, the claim holds. Q.E.D.

Lemma  2.5.9  For  all  i  in  [(logn)/6  —  2]  we  have  21-^’°^^  ^Ja,-  <  0,7.1  <  and 

2L*>'»6a2J^.  <  Fori  —  (logn)/6  — 2,  we  have  a <  2 a,- one? 

^>i+i  <  21^^'"*^'  Also,  a,,  and  hi-  are  both  0{c{x,  y)). 

Proof:  We  prove  the  bounds  for  0,7.1  and  o,» .  The  bounds  for  6,7.1  and  6,*  follow  the  same 
lines.  Since*  <  2*' by  Equation  2.4,  for  all  ?  in  [(log??) /6  —  2],  we  have  |A,7j|  =  2*’|A/|. 
Therefore,  for  all  /  in  [( log  ??  )/6-  2],  it  follows  from  Equation  2.1  that  21^*’'°®^  o,  <  0,7.1  < 
2r«>io6^  Pq,.;  _  (log7?)/6-2,  |A7+iI  <  27A7I,  and  hence,  0,7.1  <  21^.^ 

If?*  >  0,  theno,.  (resp.,  6,.)isatmost2^^'°®''^lc(3’,?/)  =  0{c(x,y)).  Otherwise, o,.  is 
(9(2riog.ol)  =  ()(c(x.y)),  since  5  and  7  are  constants.  Q.E.D. 

We define two sequences ⟨s_i⟩ and ⟨t_i⟩ of nonnegative integers as follows:

s_i = 0      if B_i ⊆ X_i^1, A_i ⊇ X_i^{-1}, C_i ⊇ X_i^2,
s_i = 1      if B_i ⊆ X_i^1, A_i ⊇ X_i^{-1}, C_i ⊉ X_i^2,
s_i = 2      if B_i ⊆ X_i^1, A_i ⊉ X_i^{-1}, and
s_i = 3 + j  if 0 ≤ j ≤ i, B_{i-j} ⊈ X_i^1, B_{i-j-1} ⊆ X_i^1.

t_i = 0      if E_i ⊆ Y_i^1, and
t_i = 1 + j  if 0 ≤ j ≤ i, E_{i-j} ⊈ Y_i^1, E_{i-j-1} ⊆ Y_i^1.

The  following  intuition  underlies  the  above  definitions  of  3,  and  t,.  For  any  j,  the 
expected  sizes  of  the  balls  A,  and  Di  are  both  2*'’^**^.  Thus,  the  expected  sizes  of  the  balls 
Bi  and  Ei  are  both  at  most  .  Moreover,  the  expected  size  of  C,  is  at  least  Ci2(*+')*', 

where c_1 is a constant that depends on d. The constant γ is chosen sufficiently large and d
is chosen sufficiently larger than γ such that the "expected behavior" of the balls A_i, B_i,
C_i, and E_i is as follows: B_i ⊆ X_i^1, A_i ⊇ X_i^{-1}, C_i ⊇ X_i^2, and E_i ⊆ Y_i^1. The value of s_i
(resp., t_i) indicates the degree to which the sizes of the balls A_i, B_i, and C_i (resp., D_i and
E_i) deviate from this expected behavior. The larger the value of s_i, the greater the deviation
from the expected behavior.
Lemma 2.5.10 If s_i is in {0, 1, 2}, then c(x_i, x_{i+1}) is O(a_i). If t_i is 0, then c(y_i, y_{i+1}) is
O(b_i).

Proof: The proof of the first claim follows from the observation that if s_i is in {0, 1, 2}
then A_i ⊆ B_i ⊆ X_i^1. The proof of the second claim follows from the observation that if t_i
is 0 then D_i ⊆ E_i ⊆ Y_i^1. Q.E.D.

2.5.3.2  Properties of ⟨s_i⟩ and ⟨t_i⟩

Our plan for determining an upper bound on the cost of the given read operation for object
A is as follows. Let τ be the smallest integer i ≥ i* such that (s_i, t_i) = (0, 0). By the
definitions of s_τ and t_τ, C_τ ⊇ X_τ^2 and Y_τ^1 ⊇ E_τ ⊇ D_τ. By Lemma 2.5.8, X_τ^2 ⊇ Y_τ^1, thus
implying that C_τ is a superset of D_τ. Thus, a copy of A is located within τ forwarding steps
along ⟨x⟩. By the definition of the primary and secondary neighbors, the cost of any request
(resp., forward) message sent by node x_i is at most O(c(x_i, x_{i+1})) (resp., c(x_i, x_{i+1})). Since
a copy of A is located within τ forwarding steps, the cost of all messages needed in locating
the particular copy of A that is read is at most O(Σ_{0≤j≤τ} (d·c(x_j, x_{j+1}) + c(y_j, y_{j+1}))).
Figure 2.8 illustrates a read request for object A generated by node x = x_0, which is
forwarded along ⟨x⟩ until a pointer to the copy of A residing at node y is found at node y_3
(a secondary neighbor of x_4). The cost of reading the copy is at most f(ℓ(A)) times the
preceding cost. Since d is a constant, the cost of reading A is at most

O(f(ℓ(A)) Σ_{0≤j≤τ} (c(x_j, x_{j+1}) + c(y_j, y_{j+1}))).    (2.7)

The remainder of the proof concerns the task of showing that
E[Σ_{0≤j≤τ} (c(x_j, x_{j+1}) + c(y_j, y_{j+1}))] is O(c(x, y)). A key idea is to establish that the
sequence ⟨s_i, t_i⟩ corresponds to a two-dimensional random walk that is biased towards (0, 0).
Lemmas 2.5.11 and 2.5.12 below provide a first step towards formalizing this notion.

Lemma 2.5.11 Let i be in [(log n)/b − 1]. Given s_j and t_j for all j in [i] such that s_{i-1} is
at least 3, the probability that s_i is less than s_{i-1} is at least 1 − ε^2. Given s_j and t_j for all
j in [i] such that t_{i-1} is at least 1, the probability that t_i is less than t_{i-1} is at least 1 − ε^2.

Figure 2.8: A read request for object A is forwarded until a pointer to a copy of A is found.

Lemma 2.5.12 Let i be in [(log n)/b − 1]. Given s_j and t_j for all j in [i] such that s_{i-1} is
at most 3, the probability that s_i is 0 is at least 1 − ε. Given s_j and t_j for all j in [i] such
that t_{i-1} is at most 1, the probability that t_i is 0 is at least 1 − ε.


In order to establish the above lemmas, we first introduce some additional notation. For
each i ≥ −1, we define S_i and T_i as follows. Let S_{-1} = T_{-1} = ∅ and for i ≥ 0 let

S_i = S_{i-1} ∪ B_i ∪ (C_i ∩ N_⊇(x_i, X_i^2))               if s_i ∈ {0, 1},
S_i = S_{i-1} ∪ B_i                                        if s_i = 2,
S_i = S_{i-1} ∪ B_{i-s_i+2} ∪ N_⊆(x_{i-s_i+3}, X_i^1)        otherwise,

T_i = T_{i-1} ∪ E_i                                        if t_i = 0,
T_i = T_{i-1} ∪ E_{i-t_i} ∪ N_⊆(y_{i-t_i+1}, Y_i^1)          otherwise.


The set S_i (resp., T_i) contains all of the nodes whose labels need to be examined to
determine the values of s_0 through s_i (resp., t_0 through t_i). Moreover, as we show in
Lemma 2.5.13, the particular values of s_0 through s_i and t_0 through t_i bias the distribution
of only a suffix of the labels of the nodes in S_i ∪ T_i.


Lemma 2.5.13 Let i be in [(log n)/b − 1]. Given s_j and t_j for all j in [i], we have


1. The variables λ_{≥0}(u), for all u not in S_i ∪ T_i, are independent and uniformly random.

2. There exists a subset S'_i of S_i of size at most d + 1 such that (i) the variables λ_{>i}(u),
for all u in S'_i, are independent and uniformly random, and (ii) for each node u in
S_i \ S'_i, 𝒫_{≤i}(u) is false.

3. There exists at most one node v in T_i such that (i) the variable λ_{>i}(v) is uniformly
random, and (ii) for each node u in T_i \ {v}, 𝒫_{≤i}(u) is false.

Proof: We prove Parts 1, 2, and 3 for all i ≥ −1. The proof is by induction. For the
induction base we set i = −1. Part 1 follows directly from the random assignment of
labels. For Part 2, we set S'_{-1} to ∅, and the desired claim holds since S_{-1} is ∅. The claim of
Part 3 holds vacuously since T_{-1} is ∅.

For the induction hypothesis, we assume that Parts 1, 2, and 3 of the lemma hold for
i − 1. We first consider different cases depending on the value of s_i.

(a)  Sj  =  3  +  j,  j  €  [i]:  The  event  =  3  +  j  is  equivalent  to  the  event  {Bi-j-i  C 
Xl)  A  {Ai-j  g  Xl).  We  first  condition  on  the  event  Bi^j-i  C  X/  by  invoking 
Corollary  2.5.4.1  with  the  substitution  (Xf ,  V  \  (Si-i  U  T,_i),  i  -  j)  for  (S,  S',  i).  We 
next  condition  on  the  event  ^  Xf  by  invoking  Part  2  of  Lemma  2.5.4  with  the 
substitution  {xi-j,Xl,  V\{Si-iUTi-iUBi-j^i),V<i)  for(u,  S,  S',  V).  By  combining 
Part  (i)  of  both  invocations,  we  have  (a.i)  the  variables  A>o(t>),  for  all  v  not  in  Si-i  U 
Ti-i  U  Bi-j-i  U  Nc  {xi-j,X}),  are  independent  and  uniformly  random.  By  combining 
Part  (ii)  of  both  invocations,  we  have  (a.ii)  for  each  node  v  in  Bi-j-i  U  Ncixi-j,Xl), 
V<i(v)  is  false. 

We  set  S',  to  S',_,  \  U  Nc{xi.i,Xl)). 

(b)  =  2:  The  event  Si  =  2  is  equivalent  to  the  event  (5*  C  X/)  A  {Ai  2  Xf^).  We  first 

condition  on  the  event  Bi  by  invoking  Corollary  2.5.4. 1  with  the  substitution 

{Xl,V  \  [Si-i  U  Ti-i),  i)  for  {S,  S',  i).  It  follows  from  the  preceding  invocation  and 
the  definition  of  Bi  that  (b.i)  the  variables  A>o(v),  for  all  v  not  in  Si-i  U  U  Bi,  are 
independent  and  uniformly  random,  and  (b.ii)  for  each  node  v  in  Bi  \  {xj+i},  V<i{v) 
is  false. 


WcsctS'itoS'i_,\{B,\{xi+i}). 
(c) s_i ∈ {0, 1}: The event s_i ∈ {0, 1} is equivalent to the event (B_i ⊆ X_i^1) ∧ (A_i ⊇ X_i^{-1}).
We  condition  on  the  event  Bi  C  Xf  by  invoking  Corollary  2.5.4  with  the  substitution 
(  A7  -  V  \  (6'i_i  U  Ti-i),  i)  for  {S,  S',  i).  It  follows  from  the  preceding  invocation  and 
the  definition  of  Bi  that  (i)  the  variables  A>o(t’),  for  all  v  not  in  Si-i  U  T-i  U  Bi,  are 
independent  and  uniformly  random,  and  (ii)  for  each  node  in  Bi  \  {xi+i },  'P<i(tO  is 
false. 

Let  S'  equal  the  set  {u  G  Ci  H  N^ia'i,  Xf):V<i{v)].  By  the  definition  of  Ci,  IS'-I  is  at 
most  (I  +  1.  If  Ci  2  then  Ci  C  N^ixi,  Xf)  and  it  follows  from  the  definition  of 
( that  (c.i)  the  variables  A>o(f ),  for  all  v  not  in  Si-i  U  Ti-i  U  Bi  U  are  independent 
and  uniformly  random,  and  (c.ii)  the  variables  A>,  (v),  for  all  v  in  S'-,  are  independent 
and  uniformly  random,  and  for  each  node  v  in  {Bi  U  C,)  \  S^,  V<i{v)  is  false.  If 
( D  Xf  then  C,  3  N^ixi,  Xf)  and  it  follows  from  Part  3  of  Lemma  2.5.4  that  (c.i) 
the  variables  A>o(u),  for  all  v  not  in  5'j_i  U  T,_i  U  Bi  U  Njixi,  Xf),  are  independent 
and  uniformly  random,  and  (c.ii)  the  variables  A>i(v),  for  all  v  in  S'^,  are  independent 
and  uniformly  random,  and  for  each  node  v  in  ( 5,  U  No  (xi,  Xf))\  S',  V<i(v)  is  false. 

We  thus  obtain  from  (a.i),  (b.i),  and  (c.i)  and  the  definition  of  5,  that  (i)  the  variables 
A>(,(  j/ ),  for  all  u  notin  S)  U  Tl-i,  are  independent  and  uniformly  random.  By  the  definitions 
of  .s,  and  ti,  the  particular  values  of  s,  and  are  independent  of  the  suffix  A>i(u)  of  any 
node  II.  In  particular,  the  variables  A>,(t/),  for  all  u  in  S-,  are  independent  and  uniformly 
random.  It  follows  from  the  preceding  observation  and  claims  (a.ii),  (b.ii),  and  (c.ii)  that 
(ii)  the  bits  of  A>;  (u),  for  all  ti  in  5'-,  are  independent  and  uniformly  random,  and  for  each 
node  in  5’,  \  S-,  P<i(n)  is  false.  We  next  consider  two  cases  depending  on  the  value  of  ti. 

(d)  /,  =  1  +  j,  j  e  [z'j:  This  case  is  similar  to  Case  (a).  The  event  =  1  +  j  is 

equivalent  to  the  event  C  >7)  (A-j  2  >7)-  We  first  condition  on  the 

event  Ei-j-i  C  by  invoking  Corollary  2.5.4.1  with  the  substitution  (17,  V  \  {Si  U 
T’,_i ),  i  —  j)  for  {S,  S',  i).  We  next  condition  on  the  event  Di-j  %  Yf-  by  invoking 
Part  2  of  Lemma  2.5.4  with  the  substitution  {y,-j,  17,  L  \  (5,  U  r,_i  U  Ei-j-i),V<i) 
for  {u,S,  S',  V). 

By  combining  Part  (i)  of  both  invocations,  we  have  (d.i)  the  variables  A>o(v),  for  all 
V  not  in  Si  U  Ti-i  U  A-j-i  U  Nq  {Vi-j ,  17 ),  independent  and  uniformly  random. 
By combining Part (ii) of both invocations, we have (d.ii) for each node v in E_{i-j-1} ∪
N_⊆(y_{i-j}, Y_i^1), 𝒫_{≤i}(v) is false.

(e) t_i = 0: This case is similar to Case (b). The event t_i = 0 is equivalent to the event
E_i ⊆ Y_i^1. We invoke Corollary 2.5.4.1 with the substitution (Y_i^1, V \ (S_i ∪ T_{i-1}), i)
for (S, S', i) to obtain that (e.i) the variables λ_{≥0}(v), for all v not in S_i ∪ T_{i-1} ∪ E_i, are
independent and uniformly random, and (e.ii) for each node v in E_i \ {y_{i+1}}, 𝒫_{≤i}(v)
is false.

To complete the induction step, we consider each part of the statement of the lemma
separately:

1. By (i), (d.i), (e.i), and the definition of T_i, it follows that given s_j and t_j, j ∈ [i], the
variables λ_{≥0}(u), for all u not in S_i ∪ T_i, are independent and uniformly random.

2. This part follows directly from (ii) above.

3. By (d.ii) and (e.ii), it follows that given arbitrary values for s_j and t_j, j ∈ [i], (i) the
variable λ_{>i}(y_{i+1}) is uniformly random, and (ii) for each node u in T_i \ {y_{i+1}}, 𝒫_{≤i}(u)
is false.

Q.E.D. 


Lemma 2.5.14 places upper bounds on the sizes of S_i and T_i.

Lemma 2.5.14 Let i be a nonnegative integer. If s_i is in {0, 1}, S_i is a subset of X_i^3;
otherwise, S_i is a subset of X_i^1. The set T_i is a subset of Y_i^1.

Proof: The proof is by induction on i. For convenience, we set i = −1 for the induction
base. Since S_{-1} = T_{-1} = ∅, the claims follow trivially. Let the claims of the lemma hold
for S_{i-1} and T_{i-1}. We will show the claim for S_i. The proof for T_i is along the same lines.

By the induction hypothesis, S_{i-1} ⊆ X_{i-1}^3. Since γ^2 < 2^b by Equation 2.2, X_{i-1}^3 ⊆ X_i^1,
hence implying that S_{i-1} ⊆ X_i^1. We now consider three cases depending on the value of
s_i.

If s_i is in {0, 1}, then B_i ⊆ X_i^1. Moreover, by Lemma 2.5.3, N_⊇(x_i, X_i^2) ⊆
N(x, Δγ^2 2^{(i+1)b}). Since Δ < γ by Equation 2.4, N(x, Δγ^2 2^{(i+1)b}) ⊆ X_i^3. It thus
follows that S_{i-1}, B_i, and N_⊇(x_i, X_i^2) are all subsets of X_i^3. Thus, S_i is a subset of X_i^3.

If s_i is 2, then B_i ⊆ X_i^1. It thus follows that S_{i-1} and B_i are both subsets of X_i^1. Thus,
S_i is a subset of X_i^1.

If s_i is greater than 2, then B_{i-s_i+2} ⊆ X_i^1. Moreover, N_⊆(x_{i-s_i+3}, X_i^1) is a subset of
X_i^1. Thus, S_i is a subset of X_i^1. Q.E.D.

The following lemma is used in the proof of Lemma 2.5.11.

Lemma 2.5.15 Let i be in [(log n)/b − 1]. Given s_k and t_k for all k in [i] such that s_{i-1} is
3 + j for some j in [i + 1], the probability that B_{i-j-1} is a subset of X_{i-1}^2 is at least 1 − ε^2/2.

Given s_k and t_k for all k in [i] such that t_{i-1} is 1 + j for some j in [i + 1], the probability
that E_{i-j-1} is a subset of Y_{i-1}^2 is at least 1 − ε^2/2.

Proof: Let ℰ denote the event that the 2i random variables s_k and t_k, k ∈ [i], take the
given values. Let us assume that ℰ holds. We begin with the proof of the first claim. Since
s_{i-1} is 3 + j, B_{i-j-1} is not a subset of X_{i-1}^1, and B_{i-j-2} is a subset of X_{i-1}^1.

By  Part  1  of  Lemma  2.5.13,  it  follows  that  given  £,  the  variables  A>o(w),  for  all  u  not 
in  Si-i  U  4-1,  are  independent  and  uniformly  random.  By  Lemma  2.5.14,  jS',-!  U  4_i  | 
is  at  most  72*^+^  By  Lemma  2.5.3,  since  7  >  A^  by  Equation  2.4,  ng  is  at 

least  7^2*7 A.  Therefore,  the  probability  that  is  not  a  subset  of  Nc{xi-j-\,Xf_-^) 

is  at  most 


(1  -  <  g-b4A-27)2J'’ 

<  sV2- 

(The  second  step  makes  use  of  the  following  inequalities:  (i)  7  >  4  A,  which  is  obtained 
from  Equation  2.4,  and  (ii)  12  <  £^/2,  which  is  obtained  from  Equa¬ 

tion  2.5.)  Since  Nc{xi-j-i ,  A7_i )  is  a  subset  of  A7_i ,  the  probability  that  Ai-j-i  is  not 
a  subset  of  Xf_^  is  at  least  1  —  12.  Since  4,_j_2  is  a  subset  of  A7_i  C  A7_i  and 
B_{i-j-1} = B_{i-j-2} ∪ A_{i-j-1}, we obtain that B_{i-j-1} is a subset of X_{i-1}^2 with probability at
least 1 − ε^2/2.

The proof of the second claim is analogous to the above proof and is obtained by
substituting (t, D, E, y, Y) for (s, A, B, x, X). Q.E.D.

We are now ready to prove Lemmas 2.5.11 and 2.5.12.

Proof of Lemma 2.5.11: Let ℰ denote the event that the 2i random variables s_j and t_j,
j ∈ [i], take the given values. Let us assume that ℰ holds. We begin with the proof of the
first claim. Let s_{i-1} be 3 + j for some j in [i]. Thus, B_{i-j-1} is not a subset of X_{i-1}^1 and
B_{i-j-2} is a subset of X_{i-1}^1.

We show that B_{i-j} is a subset of X_i^1 with probability at least 1 − ε^2. We first invoke
Lemma 2.5.15 to obtain that (a) B_{i-j-1} is a subset of X_{i-1}^2 with probability at least 1 − ε^2/2.
Let us now assume that ℰ and the event that B_{i-j-1} is a subset of X_{i-1}^2 hold.

We  now  show  (b)  the  probability  that  Bi-j  is  a  subset  of  X}  is  at  least  1  -  e^/2.  It 
follows  from  Part  1  of  Lemma  2.5.13  that  given  £  the  variables  A>o(u),  for  all  u  not  in 
S'i-i  UTi-i,  are  independent  and  uniformly  random.  Thus,  given  £  and  the  event  that  Bi-j-i 
is  a  subset  of  Xf_-^,  the  variables  A>o  (u),  for  all  u  not  in  Si-i  U  Tj_i  U  Xf_i,  are  independent 
and  uniformly  random.  By  Lemma  2.5.14,  Si-i  C  X/_i  and  Ti-i  C  j.  Therefore,  the 
size  of  the  set  ^j-i  U  Ti-i  U  Xf_i  is  at  most  7(7  -1- 1)2'^.  By  Lemma  2.5.3,  since  7  > 
by  Equation  2.4,  nc{xi-j,Xl)  is  at  least  72(*+^)*’/A.  Therefore,  the  probability  that  Ai-j 
is  not  a  subset  of  Nq  {xi-j,  Xf)  is  at  most 

<  eV2. 

(The  second  step  makes  use  of  the  following  inequalities:  (i)  2  A  (7  -|- 1)  <  A^  7"  <  2', 
which  is  obtained  from  Equation  2.2,  and  (ii)  <  (4e”^/^^)^/2  <  e^/2,  which  is 

obtained from Equation 2.5.) Thus, the probability that B_{i-j} is not a subset of X_i^1 is at
most ε^2/2.

It follows from (a) and (b) above that with probability at least 1 − ε^2, s_i is less than
s_{i-1}, thus establishing the first claim of the lemma. The proof of the second claim is
analogous to the above proof and is obtained by substituting (t, D, E, y, Y) for (s, A, B, x, X).

Q.E.D. 

Proof of Lemma 2.5.12: Let ℰ denote the event that the random variables s_j, t_j, j ∈ [i],
take the given values. Let us assume that ℰ holds. We begin with the proof of the first
claim. If s_{i-1} is in {0, 1, 2}, B_{i-1} is a subset of X_{i-1}^1. If s_{i-1} is 3, then by Lemma 2.5.15,
B_{i-1} is a subset of X_{i-1}^2 with probability at least 1 − ε^2/2. We now assume that B_{i-1} is a
subset of X_{i-1}^2.

We  first  show  that  (a)  the  probability  that  Bi  is  a  subset  of  A"/  is  at  least  1  —  s/3  +  s^/2. 
By  Part  1  of  Lemma  2.5.13,  it  follows  that  given  £  the  variables  A>o(t/),  for  all  nodes  u  not 
in  Si-i  U  r,_i,  are  independent  and  uniformly  random.  By  Lemma  2.5.14,  \Si-i  U  Tj-i  | 
is  at  most  7^2'^+L  By  Lemma  2.5.3,  since  .t,  is  in  A/_j  and  2*"  >  A^7  (by  Equation  2.2), 
nc{xi,Xl)  is  at  least  72**+')^/ A.  Therefore,  the  probabihty  that  Ai  is  not  a  subset  of 
Nc(xi,  Xf )  is  at  most 

(1  -  <-  g-(7/A-27V2'’) 

<  s/3  —  s^/2. 

(The  second  inequality  follows  from  the  inequality  47^  A  <  2**,  which  is  obtained  from 
Equation  2.2.  The  last  inequahty  follows  from  the  inequalities  (i)  <  s/4,  which  is 

obtained  from  Equation  2.5,  and  (ii)  s/4  <  s/3  —  s^/2,  which  holds  since  s  <  1/10  by 
Equation  2.6.)  This  implies  that  the  probabihty  that  Ai  is  not  a  subset  of  Xf  is  at  most 

s/3-sV2. 

We  next  show  that  (b)  the  probabihty  that  A,  is  a  superset  of  XA  is  at  least  1  —  s/3. 
Since  A^7^  <  2**  by  Equation  2.2,  Lemma  2.5.3  imphes  that  nj{xi,X~^)  <  A2^'''''^^/7. 
By  Lemma  2.5.13,  we  have  (i)  the  variables  A>o(u),  for  all  u  not  in  5,_i  ur,_i ,  are  indepen¬ 
dent  and  uniformly  random,  and  (ii)  there  are  at  most  d+1  nodes  in  U  T,_i  for  which 
the  predicate  'P<,  holds.  Therefore,  the  probabihty  that  A,  is  a  subset  of  No{xi,  Xf^)  is  at 
most  (c?  A  l)/2*’  +  A/7,  which  is  at  most  s/3.  It  follows  that  the  probabihty  that  A,  is  not 
a  superset  of  A"^"*  is  at  most  s /3. 
We finally show that (c) given that B_i is a subset of X_i^1 and A_i is a superset of X_i^{-1}, the
probability that C_i is a superset of X_i^2 is at least 1 − ε/3. We note that given ℰ and the two
events  Bi  is  a  subset  of  Xf  and  Ai  is  a  superset  of  X~^,  (i)  the  variables  A>o(u),  for  all  u 
not  in  Si-1  U  Tj-i  U  Xf ,  are  independent  and  uniformly  random,  and  (ii)  there  exist  at  most 
d+1  nodes  in  Si-i  U  Ti-i  for  which  the  predicate  V<i  holds. 

We  will  place  an  upper  bound  on  the  probability  that  Ci  is  not  a  superset  of  Xfhy 
placing  a  lower  bound  on  the  probability  that  Ci  is  not  a  superset  of  N^ixi,  Xf),  which  is 
a  superset  of  X/.  Let  ro  (resp.,  rj)  denote  n2(a;i,Xf^)  (resp.,  n2{xi,  Xf)).  By  definition, 
njixi,  Xf^)  is  at  least  By  Lemma  2.5.3,  n^ixi,  Xf)  is  at  most 

We  first  show  that  the  nodes  in  N2  (a;,,  Xf)  are  within  a  cost  of  d  ■  c{xi,  We  note 
that  c{xi,Xi+i)  is  at  least  the  difference  of  the  radii  of  X~^  and  Xf_i.  Moreover,  since 
Noixi,  Xf)  is  a  subset  of  Af,  n2{xi,Xf)  is  at  most  the  sum  of  the  radii  of  Xf  and  Xf_i. 
Since  (47^)’"®*^  <  7^  <  by  Equation  2.3,  all  of  the  nodes  in  N2{xi,Xf)  are  within  a 
cost  of  d  •  c{xi,  Xijfi)  from  Xi. 

It  now  follows  that  the  probability  that  Ci  is  not  a  superset  of  Xf  is  at  most  the  proba¬ 
bility  that  there  exist  d  nodes  in  N2  {xi,  Xf)  whose  {i  +  1)6  rightmost  bits  match  a  certain 
bit-sequence.  This  probability  is  at  most 

<  e/3. 

(The  second  step  follows  from  the  inequalities  (2e/2*’)‘^/^  <  e/6  and  (eA'j^/d)^  <  e/6, 
both  of  which  are  derived  from  Equation  2.5.) 

It follows from (a), (b), and (c) above that with probability at least 1 − ε, s_i is 0, thus
establishing the first claim of the lemma. The proof of the second claim is analogous to the
proof of (a) and is obtained by substituting (t, D, E, y, Y) for (s, A, B, x, X).

Q.E.D. 

By the definitions of s_i and t_i, it follows that 0 ≤ s_{i+1} ≤ 3 if s_i ≤ 2, and 0 ≤ s_{i+1} ≤
s_i + 1 otherwise. In addition, 0 ≤ t_{i+1} ≤ t_i + 1, for all i. Let s'_i equal 0 if s_i = 0, 1 if
s_i ∈ {1, 2, 3}, and s_i − 2 otherwise. Hence 0 ≤ max{s'_{i+1}, t_{i+1}} ≤ max{s'_i, t_i} + 1, for
all i. In Section 2.5.3.3 below, we analyze the random walk corresponding to the sequence
⟨max{s'_i, t_i}⟩.
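
The bounded-increment property stated above can be checked mechanically. The following is a minimal sketch in Python, assuming only the transition constraints quoted in the preceding paragraph (s_{i+1} ≤ 3 when s_i ≤ 2, s_{i+1} ≤ s_i + 1 otherwise, and t_{i+1} ≤ t_i + 1). The function names and the search bound are illustrative and do not appear in the text.

```python
def s_prime(s):
    """Collapse s into s': 0 -> 0, {1, 2, 3} -> 1, otherwise s - 2."""
    if s == 0:
        return 0
    if s <= 3:
        return 1
    return s - 2


def next_s_values(s):
    """Values of s_{i+1} permitted after s_i = s (upper bounds only)."""
    return range(0, 4) if s <= 2 else range(0, s + 2)


def next_t_values(t):
    """Values of t_{i+1} permitted after t_i = t."""
    return range(0, t + 2)


def max_increment(bound):
    """Largest one-step increase of max{s', t}, over all s, t <= bound."""
    worst = 0
    for s in range(bound + 1):
        for t in range(bound + 1):
            before = max(s_prime(s), t)
            for s_next in next_s_values(s):
                for t_next in next_t_values(t):
                    after = max(s_prime(s_next), t_next)
                    worst = max(worst, after - before)
    return worst
```

Calling max_increment with a moderate bound enumerates every permitted transition and reports the worst case, which should never exceed 1 if the collapsed sequence moves up by at most one per step.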

2.5.3.3  Random walks

We begin the analysis of the random walk corresponding to the sequence ⟨max{s'_i, t_i}⟩
by proving several useful properties of certain random walks on a line. These properties
are stated in Lemmas 2.5.16 through 2.5.21. The main technical claim of this section is
Lemma 2.5.23.

Let W(U, F) be a directed graph in which U is the set of nodes and F is the set of
edges. For all u in U, let D_u be a probability distribution over the set {(u, v) ∈ F}. We
define Pr_{D_u}[(u, v) : (u, v) ∉ F] = 0, for convenience. A random walk on W starting at
v_0 and according to {D_u : u ∈ U} is a random sequence ⟨v⟩ such that (i) v_i is in U and
(v_i, v_{i+1}) is in F, for all i, and (ii) given any fixed (not necessarily simple) path u_0, ..., u_i
in W and any fixed u_{i+1} in U, Pr[v_{i+1} = u_{i+1} | (v_0, ..., v_i) = (u_0, ..., u_i)] =
Pr[v_{i+1} = u_{i+1} | v_i = u_i] = Pr_{D_{u_i}}[(u_i, u_{i+1})].

Let H be the directed graph with node set N (the set of nonnegative integers) and edge
set {(i, j) : i ∈ N, 0 ≤ j ≤ i + 1}. Let H' be the subgraph of H induced by the edges
{(i + 1, i), (i, i + 1) : i ∈ N} ∪ {(0, 0), (1, 1)}.

Let p and q be reals in (0, 1] such that p ≥ q. We now define two random walks, ω_{p,q}
and ω'_{p,q}, on graphs H and H', respectively. The walk ω_{p,q} = ⟨w⟩ is characterized by (i)
Pr[w_{i+1} ≤ j − 1 | w_i = j] ≥ p, for any integer j > 1, (ii) Pr[w_{i+1} = 0 | w_i = j] ≥ q, for j
equal 0 or 1, and (iii) Pr[w_{i+1} = 2 | w_i = 1] ≤ 1 − p. The walk ω'_{p,q} = ⟨w'⟩ is characterized
by (i) Pr[w'_{i+1} = j − 1 | w'_i = j] = p, for all integer j > 1, (ii) Pr[w'_{i+1} = 0 | w'_i = j] = q,
for j equal 0 or 1, and (iii) Pr[w'_{i+1} = 2 | w'_i = 1] = 1 − p. We note that Lemmas 2.5.11
and 2.5.12 imply that the sequence ⟨max{s'_i, t_i}⟩ can be represented by the random walk
ω_{p,q} with p = 1 − 2ε^2 and q = 1 − 2ε.
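
To make the exactly characterized walk on H' concrete, the following Python sketch simulates it. It is an illustration under stated assumptions, not part of the analysis: the transition rule is read off conditions (i) through (iii) for the primed walk together with the edge set of H' (so state 1 keeps the leftover mass p − q as a self-loop, and state 0 moves to 1 with probability 1 − q), and the names step, hitting_time, and mean_hitting_time are hypothetical.

```python
import random


def step(j, p, q, rng):
    """Return the next state of the primed walk from state j."""
    r = rng.random()
    if j >= 2:
        # One step toward 0 with probability p, otherwise one step away.
        return j - 1 if r < p else j + 1
    if j == 1:
        if r < q:
            return 0  # reach 0 with probability q
        if r < q + (1 - p):
            return 2  # move away with probability 1 - p
        return 1      # self-loop with the remaining mass p - q
    # j == 0: self-loop with probability q, otherwise move to 1.
    return 0 if r < q else 1


def hitting_time(start, p, q, rng, max_steps=100_000):
    """Steps (at least one) until the walk is at 0, or None if not reached."""
    state = start
    for steps in range(1, max_steps + 1):
        state = step(state, p, q, rng)
        if state == 0:
            return steps
    return None


def mean_hitting_time(start, p, q, trials, seed=0):
    """Monte Carlo estimate of the expected hitting time of 0 from start."""
    rng = random.Random(seed)
    times = [hitting_time(start, p, q, rng) for _ in range(trials)]
    finished = [t for t in times if t is not None]
    return sum(finished) / len(finished)
```

For instance, with ε = 0.1 one would take p = 1 − 2ε^2 = 0.98 and q = 1 − 2ε = 0.8 and call mean_hitting_time(3, 0.98, 0.8, 1000) to see how quickly the walk returns to 0.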

We analyze random walk ω_{p,q} by first showing that ω_{p,q} is more "favorable" than ω'_{p,q}
with respect to the properties of interest. The random walk ω'_{p,q} is easier to analyze as it
is exactly characterized by p and q. Lemmas 2.5.16 and 2.5.18 show that the bias of ω_{p,q}
towards 0 is more than that of ω'_{p,q}. Since the values of p and q are fixed throughout the
following discussion, we omit the subscript p, q in the terms ω_{p,q} and ω'_{p,q} for convenience.

Lemma 2.5.16 For all i and k in N, for random walks ω and ω', we have Pr[w_i ≤ k] ≥
Pr[w'_i ≤ k].

Proof: We prove the claim by induction on i. The base case i = 0 is trivial. Assume the
claim holds for i and any k. If k ≥ 1, then we have

Pr[w'_{i+1} ≤ k] = Pr[w'_i ≤ k − 1] + p Pr[k ≤ w'_i ≤ k + 1]
                = (1 − p) Pr[w'_i ≤ k − 1] + p Pr[w'_i ≤ k + 1], and
Pr[w_{i+1} ≤ k] ≥ Pr[w_i ≤ k − 1] + p Pr[k ≤ w_i ≤ k + 1]
               = (1 − p) Pr[w_i ≤ k − 1] + p Pr[w_i ≤ k + 1].

If k = 0, then we have

Pr[w'_{i+1} ≤ 0] = q Pr[w'_i ≤ 0] + q Pr[w'_i = 1]
                = q Pr[w'_i ≤ 1], and
Pr[w_{i+1} ≤ 0] ≥ q Pr[w_i ≤ 0] + q Pr[w_i = 1]
               = q Pr[w_i ≤ 1].

The lemma now follows by induction. Q.E.D.

We now establish a probabilistic relationship between the number of steps it takes for
the random walks ω and ω' to reach node 0 starting from a given node i. Let z_i(σ) be the
random variable denoting the number of steps taken to reach node 0 starting from node i,
for a random walk σ.

Lemma 2.5.17 For all ℓ and all i > 0, we have Pr[z_i(ω') ≤ ℓ] ≤ Pr[z_{i-1}(ω') ≤ ℓ].

Proof:  We  use  induction  on  i.  The  base  case  f  =  0  is  trivial.  Let  ^  >  1.  If  ?  >  2  then 

Pr[zi_i(a;')  <  1]  =  pPr[zi^2{^')  <  f  —  1]  +  (1  -  p)  Pr[^i(L<;')  <  ^  —  1] 

>  pPr[2i_i(t<;')  <  f  -  1]  +  (1  -  p)  Pr[^i+i(u;')  <  i  -  1] 

=  PvlZiiio')  <  i]. 
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where  the  second  step  follows  from  the  induction  hypothesis.  If  i  =  2,  then  we  have 

Pr[5ri(a;')  <  i]  =  q  +  (1  -  q  -  (1  -  p))  Pr[2i(a;')  <  f  -  1]  +  (1  -  p)  PT[z2{io')  <  f  -  1] 

>  pPr[zi(w')  <  f  -  1]  +  (1  -  p)  Pv[z2iio')  <  f  -  1] 

>  pPr[^i(t^')  <  ^  -  1]  +  (1  -  p)  Pr[23(u;')  <  f  -  1] 

=  Pr[z2(cu')  <  f], 

where  the  second  step  follows  from  the  induction  hypothesis.  If  ^  =  1  then  we  have 

Pr[.2o(t<^')  <  ^]  =  9  +  (1  -  9)  Pr[2i(u;')  <  f  -  1] 

>  q  Pr[2ro(t^')  <  f  -  1]  +  (p  -  9)  Pr [2:1  (cu')  <  f  -  1]  + 

+  (1  -  p)  Pr[xr2(<^')  <  f  -  1] 

=  Pr[^i(a;')  <  f], 

where  the  second  step  follows  from  the  induction  hypothesis.  Q.E.D. 
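
The monotonicity claimed in Lemma 2.5.17 can be explored with a small dynamic program. The sketch below is an illustration under stated assumptions, not the thesis's own procedure: it uses the one-step recurrences $\Pr[z_0 \le \ell] = q + (1-q)\Pr[z_1 \le \ell-1]$ and $\Pr[z_1 \le \ell] = q + (p-q)\Pr[z_1 \le \ell-1] + (1-p)\Pr[z_2 \le \ell-1]$ that appear in this section, treats $z_0$ as a return time (so $z_0 \ge 1$), and uses hypothetical helper names.

```python
from fractions import Fraction
from functools import lru_cache


def hitting_cdf(p, q):
    """Return F with F(i, l) = Pr[z_i <= l] for the reference walk."""

    @lru_cache(maxsize=None)
    def F(i, l):
        if l <= 0:
            return Fraction(0)  # no node reaches 0 in zero steps
        if i == 0:
            return q + (1 - q) * F(1, l - 1)
        if i == 1:
            return q + (p - q) * F(1, l - 1) + (1 - p) * F(2, l - 1)
        return p * F(i - 1, l - 1) + (1 - p) * F(i + 1, l - 1)

    return F


def is_monotone(F, max_i, max_l):
    """Check F(i, l) <= F(i - 1, l) on a finite grid of (i, l) values."""
    return all(
        F(i, l) <= F(i - 1, l)
        for i in range(1, max_i + 1)
        for l in range(max_l + 1)
    )


eps = Fraction(1, 20)  # illustrative value only
F = hitting_cdf(1 - 2 * eps**2, 1 - 2 * eps)
```

A finite check of this kind does not replace the induction, but it is a quick way to catch a mistyped recurrence.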

We now use Lemma 2.5.17 to argue that the random variable $z_i(\omega)$ is stochastically
dominated by the random variable $z_i(\omega')$.

Lemma 2.5.18 For all $i$ and $\ell$ in N, we have $\Pr[z_i(\omega) \le \ell] \ge \Pr[z_i(\omega') \le \ell]$.

Proof: The proof is by induction on $\ell$. The base case $\ell = 0$ is trivial. Let
$p_j = \Pr[w_{i+1} \le j - 1 \mid w_i = j]$, for $j \ge 1$, and $q_j = \Pr[w_{i+1} = j \mid w_i = j]$, for all $j$ in N.
Note that the following inequalities hold: (i) $p \le p_j$, for all $j > 1$, (ii) $q \le \min\{p_1, q_0\}$,
and (iii) $p \ge q$.

If $i \ge 2$, then we have

\[
\begin{aligned}
\Pr[z_i(\omega') \le \ell] &= p \Pr[z_{i-1}(\omega') \le \ell-1] + (1-p)\Pr[z_{i+1}(\omega') \le \ell-1] \\
&\le p_i \Pr[z_{i-1}(\omega') \le \ell-1] + (1-p_i)\Pr[z_{i+1}(\omega') \le \ell-1] \\
&\le p_i \Pr[z_{i-1}(\omega') \le \ell-1] + q_i \Pr[z_i(\omega') \le \ell-1] + (1 - p_i - q_i)\Pr[z_{i+1}(\omega') \le \ell-1] \\
&\le p_i \Pr[z_{i-1}(\omega) \le \ell-1] + q_i \Pr[z_i(\omega) \le \ell-1] + (1 - p_i - q_i)\Pr[z_{i+1}(\omega) \le \ell-1] \\
&= \Pr[z_i(\omega) \le \ell].
\end{aligned}
\]

(The second step holds because (i) $p \le p_i$, and (ii) $\Pr[z_{i-1}(\omega') \le \ell-1] \ge \Pr[z_{i+1}(\omega') \le \ell-1]$,
which follows from Lemma 2.5.17. The third step holds since
$\Pr[z_i(\omega') \le \ell-1] \ge \Pr[z_{i+1}(\omega') \le \ell-1]$ by Lemma 2.5.17. The fourth step follows from the
induction hypothesis.) For $i = 1$, we have

\[
\begin{aligned}
\Pr[z_1(\omega') \le \ell] &= q + (p-q)\Pr[z_1(\omega') \le \ell-1] + (1-p)\Pr[z_2(\omega') \le \ell-1] \\
&\le p_1 + q_1 \Pr[z_1(\omega') \le \ell-1] + (1 - p_1 - q_1)\Pr[z_2(\omega') \le \ell-1] \\
&\le p_1 + q_1 \Pr[z_1(\omega) \le \ell-1] + (1 - p_1 - q_1)\Pr[z_2(\omega) \le \ell-1] \\
&= \Pr[z_1(\omega) \le \ell].
\end{aligned}
\]

(The second step holds because (i) $q \le p_1$, (ii) $1 - p \ge 1 - p_1 - q_1$, and
(iii) $\Pr[z_1(\omega') \le \ell-1] \ge \Pr[z_2(\omega') \le \ell-1]$, which follows from Lemma 2.5.17. The third step
follows from the induction hypothesis.)

For $i = 0$, we have

\[
\begin{aligned}
\Pr[z_0(\omega') \le \ell] &= q + (1-q)\Pr[z_1(\omega') \le \ell-1] \\
&\le q_0 + (1-q_0)\Pr[z_1(\omega') \le \ell-1] \\
&\le q_0 + (1-q_0)\Pr[z_1(\omega) \le \ell-1] \\
&= \Pr[z_0(\omega) \le \ell].
\end{aligned}
\]

(The second step holds because $q \le q_0$. The third step follows from the induction
hypothesis.) We have thus established the desired claim. Q.E.D.

We now show that, in a probabilistic sense, the time to return to 0 is smaller for $\omega$
than for $\omega'$. For any $i$, let $\tau_i$ (resp., $\tau'_i$) denote the smallest $j \ge 0$ such that $w_{i+j} = 0$
(resp., $w'_{i+j} = 0$). We note that by letting $\langle w \rangle$ represent $\langle \max\{s', t\} \rangle$, the terminating
step $\tau$ is given by $i^* + \tau_{i^*}$. Lemma 2.5.20 shows that, for any $i$, the random variable $\tau_i$ is
stochastically dominated by the random variable $\tau'_i$. We first prove the following technical
lemma:

Lemma 2.5.19 Let $m$ be a nonnegative integer and let $\langle \eta \rangle$ be a sequence of $m$ nonincreasing
reals. Let $\langle p \rangle$ and $\langle q \rangle$ be two sequences of $m$ reals each such that (i) for all $j$ in $[m]$,
$\sum_{0 \le i < j} p_i \ge \sum_{0 \le i < j} q_i$, and (ii) $\sum_{0 \le i < m} p_i = \sum_{0 \le i < m} q_i$. Then, we have:

\[
\sum_{0 \le i < m} p_i \eta_i \ge \sum_{0 \le i < m} q_i \eta_i.
\]

Proof: The proof is by induction on $m$. The induction basis is trivial. For the induction
hypothesis, we assume that the statement of the lemma holds for $m$. We now establish the
claim for $m + 1$.

\[
\begin{aligned}
\sum_{0 \le i < m+1} p_i \eta_i &= q_0 \eta_0 + (p_0 - q_0)\eta_0 + \sum_{1 \le i < m+1} p_i \eta_i \\
&\ge q_0 \eta_0 + (p_0 - q_0 + p_1)\eta_1 + \sum_{2 \le i < m+1} p_i \eta_i \\
&\ge q_0 \eta_0 + \sum_{1 \le i < m+1} q_i \eta_i \\
&= \sum_{0 \le i < m+1} q_i \eta_i.
\end{aligned}
\]

(The third step follows from the induction hypothesis and the inequalities $\eta_0 \ge \eta_1$ and
$p_0 \ge q_0$. We note that the induction hypothesis can be invoked since
$p_0 - q_0 + p_1 + \sum_{2 \le i < j} p_i \ge \sum_{1 \le i < j} q_i$ and
$p_0 - q_0 + p_1 + \sum_{2 \le i < m+1} p_i = \sum_{1 \le i < m+1} q_i$.) Q.E.D.
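
A small numerical instance may help in reading Lemma 2.5.19. The sketch below is illustrative only; the helper names and the sample sequences are hypothetical, and the prefix condition is checked on inclusive prefixes (which, together with equal totals, covers conditions (i) and (ii)). Shifting weight toward the front of a nonincreasing sequence can only increase the weighted sum.

```python
from fractions import Fraction
from itertools import accumulate


def prefix_dominates(p, q):
    """True if each prefix sum of p is >= that of q and the totals agree."""
    if not p:
        return True
    pp, qq = list(accumulate(p)), list(accumulate(q))
    return all(a >= b for a, b in zip(pp, qq)) and pp[-1] == qq[-1]


def weighted_sum(weights, values):
    """Sum of weights[i] * values[i]."""
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical example: p puts more mass on early indices than q does.
p_seq = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
q_seq = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
eta = [Fraction(9, 10), Fraction(1, 2), Fraction(1, 10)]  # nonincreasing
```

In the proof of Lemma 2.5.20 the same comparison is applied with the two position distributions in the roles of the weight sequences.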

Lemma 2.5.20 For any $i$ and $j \ge i$, we have $\Pr[\tau_i \le j] \ge \Pr[\tau'_i \le j]$.

Proof: The desired claim follows from the following argument:

\[
\begin{aligned}
\Pr[\tau_i \le j] &= \sum_{0 \le k \le i} \Pr[w_i = k] \Pr[z_k(\omega) \le j] \\
&\ge \sum_{0 \le k \le i} \Pr[w_i = k] \Pr[z_k(\omega') \le j] \\
&\ge \sum_{0 \le k \le i} \Pr[w'_i = k] \Pr[z_k(\omega') \le j] \\
&= \Pr[\tau'_i \le j].
\end{aligned}
\]

(The first step follows from the definitions of $\tau_i$ and $z_i(\omega)$. For the second step, we
use Lemma 2.5.18. For the third step we first invoke Lemma 2.5.16 and then invoke
Lemma 2.5.19 with the substitution $(i, k, \Pr[w_i = k], \Pr[w'_i = k], \Pr[z_k(\omega') \le j])$ for
$(m, i, p_i, q_i, \eta_i)$. We note that one of the conditions for the latter invocation, namely,
that $\Pr[z_k(\omega') \le j]$ is nonincreasing with $k$, follows from Lemma 2.5.17. The fourth step
follows from the definitions of $\tau'_i$ and $z_i(\omega')$.) Q.E.D.

Lemma 2.5.20 indicates that we can obtain an upper bound on the time taken for the
random walk $\omega$ to return to 0 by deriving a corresponding bound for the random walk
$\omega'$. Indeed, we will use Lemma 2.5.20 to obtain an upper bound on the length of any
"excursion" in $\omega$. An excursion of length $\ell$ in a graph $W$ with node set N is a walk that
starts at node 0 and first returns to the start node at time $\ell$, for all $\ell$ in N. For all $i$ such that
$w_i = 0$, let $\ell_i(\omega)$ be the random variable that gives the length of the excursion in $\omega$ starting
at time $i$. We note that for all $i$, $\ell_i(\omega)$ equals $\tau_{i+1} + 1$.

The following lemma, which describes a probabilistic recurrence relation for the length
of an excursion in $\omega'$, is proved using a classical combinatorial result known as Raney's
lemma [17, 46].

Lemma 2.5.21 Let $p$ and $q$ satisfy the inequality $1 - p \le (p - q)^2$. For all $i$ and $\ell$ in N, we
have $\Pr[\ell_i(\omega') = \ell + 1 \mid w'_i = 0] \le \max\{1 - q, 5(p - q)\} \Pr[\ell_i(\omega') = \ell \mid w'_i = 0]$.

Proof: Since $\omega'$ is a random walk,

\[
\Pr[\ell_i(\omega') = \ell \mid (w'_0, \ldots, w'_i) = (u_0, \ldots, u_{i-1}, 0)] = \Pr[\ell_0(\omega') = \ell \mid w'_0 = 0]
\]

for any $u_0, \ldots, u_{i-1}$ in N. For the remainder of the proof, we assume without loss of
generality that $i$ is 0.

For $\ell = 1$, the desired claim holds since $\Pr[\ell_0(\omega') = 2] / \Pr[\ell_0(\omega') = 1] = (1 - q)$. We
now consider $\ell \ge 2$. Let $E_j$ denote the event that the random walk does not reach node 0 in
the first $j$ steps. That is, $E_j$ is the event that $w'_k$ is nonzero for all $k$ in $[1, j]$. For all $j$, let $\alpha_j$
denote the probability that $w'_{j+1}$ is 1 and $E_{j+1}$ holds, given that $w'_1$ is 1. For convenience,
we assume that $\alpha_{-1}$ equals $1/(p - q)$. We obtain that:

\[
\Pr[\ell_0(\omega') = \ell] = (1 - q) \cdot \alpha_{\ell-2} \cdot q. \qquad (2.8)
\]

It thus follows that the ratio of $\Pr[\ell_0(\omega') = \ell + 1]$ and $\Pr[\ell_0(\omega') = \ell]$ equals $\alpha_{\ell-1}/\alpha_{\ell-2}$.
The remainder of the proof is devoted to obtaining an upper bound on $\alpha_{j+1}/\alpha_j$ for all $j \ge 0$.

Let $\beta_m$ denote the probability that the following conditions hold given that $w'_1 = 1$: (i)
$E_{2m+1}$ holds, (ii) $w'_{2m+1} = 1$, and (iii) the self-loop $(1, 1)$ is not traversed in any of the first
$2m + 1$ steps. Using Raney's lemma [17, 46], we have
$\beta_m = \frac{1}{2m+1}\binom{2m+1}{m}(p(1-p))^m$. From the definitions of $\alpha_j$ and $\beta_m$, it follows that:

\[
\begin{aligned}
\alpha_j &= \sum_{0 \le m \le \lfloor j/2 \rfloor} \beta_m \cdot (p - q) \cdot \alpha_{j-2m-1} \\
&= \sum_{0 \le m \le \lfloor j/2 \rfloor} \frac{1}{2m+1}\binom{2m+1}{m}(p(1-p))^m \cdot (p - q) \cdot \alpha_{j-2m-1}.
\end{aligned}
\]

We now prove by induction on $j \ge 0$ that $\alpha_{j+1}/\alpha_j$ is at most $5(p - q)$. The induction
base holds since $\alpha_0$ is 1 and $\alpha_1$ is $(p - q)$. For the induction hypothesis, we assume that
$\alpha_{j+1}/\alpha_j$ is at most $5(p - q)$ for all $j \le k - 1$. If $k$ is even, then we have:

\[
\frac{\alpha_{k+1}}{\alpha_k} \le \max_{0 \le m \le k/2} \frac{\alpha_{k-2m}}{\alpha_{k-2m-1}} \le 5(p - q),
\]

where the last step follows from the induction hypothesis. If $k$ is odd, then $\alpha_{k+1}/\alpha_k$ is at
most

\[
\max\left\{
\frac{\frac{1}{k}\binom{k}{(k-1)/2}(p - q)^2 + \frac{1}{k+2}\binom{k+2}{(k+1)/2}\, p(1-p)}
     {\frac{1}{k}\binom{k}{(k-1)/2}(p - q)},
\ \max_{0 \le m \le (k-3)/2} \frac{\alpha_{k-2m}}{\alpha_{k-2m-1}}
\right\}
\le \max\{5(p - q), 5(p - q)\} = 5(p - q),
\]

where the second step follows from the induction hypothesis along with the inequalities
$1 - p \le (p - q)^2$ and

\[
\binom{k+2}{(k+1)/2} \le 4 \binom{k}{(k-1)/2}.
\]

The claim of the lemma follows from the upper bound on $\alpha_{j+1}/\alpha_j$ and Equation 2.8.
Q.E.D.
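
The ratio bound in the proof of Lemma 2.5.21 can be sanity-checked by computing the survival probabilities in two ways. The sketch below is an illustration under assumptions, not the thesis's own computation: it takes $\alpha_j$ to be the probability that a walk at node 1 is again at node 1 after $j$ further steps without visiting node 0, uses the first-self-loop decomposition $\alpha_j = \sum_m \beta_m (p-q)\,\alpha_{j-2m-1}$ with the convention $\alpha_{-1} = 1/(p-q)$ and the Catalan form $\beta_m = \frac{1}{2m+1}\binom{2m+1}{m}(p(1-p))^m$, and compares it with a direct dynamic program. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def beta(m, p):
    """(1/(2m+1)) * C(2m+1, m) * (p(1-p))^m, i.e. Catalan(m) * (p(1-p))^m."""
    return Fraction(comb(2 * m + 1, m), 2 * m + 1) * (p * (1 - p)) ** m


def alpha_by_recurrence(j_max, p, q):
    """alpha_j via the first-self-loop decomposition."""
    alpha = {-1: 1 / (p - q), 0: Fraction(1)}
    for j in range(1, j_max + 1):
        alpha[j] = sum(
            beta(m, p) * (p - q) * alpha[j - 2 * m - 1]
            for m in range(j // 2 + 1)
        )
    return alpha


def alpha_by_dp(j_max, p, q):
    """alpha_j by propagating probability mass over nodes >= 1 only."""
    dist = {1: Fraction(1)}  # node -> probability, node 0 never visited
    out = {0: Fraction(1)}
    for j in range(1, j_max + 1):
        nxt = {}
        for k, mass in dist.items():
            if k == 1:
                nxt[1] = nxt.get(1, 0) + (p - q) * mass
                nxt[2] = nxt.get(2, 0) + (1 - p) * mass
            else:
                nxt[k - 1] = nxt.get(k - 1, 0) + p * mass
                nxt[k + 1] = nxt.get(k + 1, 0) + (1 - p) * mass
        dist = nxt
        out[j] = dist.get(1, Fraction(0))
    return out


eps = Fraction(1, 20)  # illustrative value with 1 - p <= (p - q)^2
p, q = 1 - 2 * eps**2, 1 - 2 * eps
rec = alpha_by_recurrence(12, p, q)
dp = alpha_by_dp(12, p, q)
```

Agreement of the two computations supports the decomposition, and the successive ratios `rec[j + 1] / rec[j]` can then be compared against `5 * (p - q)`.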


We are now ready to use the properties of the random walks $\omega$ and $\omega'$ that are stated in
Lemmas 2.5.20 and 2.5.21 to analyze the random walk obtained by the sequence $\langle \max\{s', t\} \rangle$.
We set $p = 1 - 2\varepsilon^2$ and $q = 1 - 2\varepsilon$. Lemmas 2.5.11 and 2.5.12 imply that $\omega$ characterizes
the random walk corresponding to the sequence $\langle \max\{s', t\} \rangle$. Consider the random walk
$\omega'$. We define a sequence $\langle v \rangle$ associated with $\langle w' \rangle$ as follows: If $w'_j = 0$ then $v_j = G$;
otherwise, $v_j = B$.

Lemma 2.5.22 Let $i$ be in $[(\log n)/b - 1]$. Given any fixed sequence $\langle v \rangle_{i-1}$ of B, G values,
the probability that $w'_i$ is 0 is at least $1 - 10\varepsilon$.

Proof: Assume that $v_j = G$. What is the probability that $v_i = G$, $i > j$, if we know that
$v_k = B$, for all integers $k$ in the interval $[j + 1, i)$? From Lemma 2.5.21, it follows that this
probability is at least $1 - 10\varepsilon$. This is because Lemma 2.5.21 states that
$1 - \max\{2\varepsilon, 10(\varepsilon - \varepsilon^2)\}$ is a lower bound on the probability that there is an excursion of
length $i - j$ starting at $j$ in $\omega'$, given that there is an excursion of length at least $i - j$
starting at $j$ in $\omega'$. (Note that $1 - p \le (p - q)^2$ since $\varepsilon < 1/10$ by Equation 2.6.)

Given any fixed B, G sequence $(u_0, \ldots, u_{i-1})$ we have that
$\Pr[v_i = G \mid (v_0, \ldots, v_{i-1}) = (u_0, \ldots, u_{j-1}, G, B, \ldots, B)]$ is equal to
$\Pr[v_i = G \mid (v_j, \ldots, v_{i-1}) = (G, B, \ldots, B)]$.
Since this holds for any $j \ge 0$ and since $w'_i = 0$ if and only if $v_i = G$, we have
$\Pr[w'_i = 0 \mid (v_0, \ldots, v_{i-1}) = (u_0, \ldots, u_{i-1})] \ge 1 - 10\varepsilon$. Q.E.D.

Our main technical claim concerning the random walk $\omega$ now follows from Lemmas
2.5.20 and 2.5.22.

Lemma 2.5.23 For any $i$ in $[(\log n)/b - 1]$ and any nonnegative integer $j$, the probability
that $\tau_i \ge j$ is at most $(10\varepsilon)^j$.

Proof: By Lemma 2.5.22, the probability that $\tau'_i$ is at least $j$ is at most $(10\varepsilon)^j$. The desired
claim then follows from Lemma 2.5.20. Q.E.D.
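
A geometric tail of this kind immediately bounds the expectation: for a nonnegative integer random variable, the expectation equals the sum over $j \ge 1$ of the tail probabilities, so a tail bound of $(10\varepsilon)^j$ gives an expected value of at most $10\varepsilon/(1 - 10\varepsilon)$ whenever $10\varepsilon < 1$. The following sketch only illustrates this arithmetic; the function names and the sample value of $\varepsilon$ are hypothetical and are not taken from Equation 2.6.

```python
from fractions import Fraction


def expected_return_bound(eps):
    """Closed form of sum_{j >= 1} (10 * eps)^j, valid when 10 * eps < 1."""
    r = 10 * eps
    if not 0 <= r < 1:
        raise ValueError("need 0 <= 10 * eps < 1")
    return r / (1 - r)


def partial_tail_sum(eps, terms):
    """Sum of the first `terms` tail bounds (10 * eps)^j, j = 1..terms."""
    r = 10 * eps
    return sum(r**j for j in range(1, terms + 1))
```

For instance, with the illustrative value `eps = Fraction(1, 20)` the ratio is 1/2, and the partial sums increase toward the closed form.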

2.5.3.4  Proofs  of  Theorems  1  and  2 

We first derive upper bounds on $E[c(x_i, x_{i+1})]$ and $E[c(y_i, y_{i+1})]$, for all $i$, using Lemma
2.5.23. Recall that $a_i$ and $b_i$ denote the radii of $X_i$ and $Y_i$, respectively, and $i^*$ is the
smallest integer $i$ such that $a_i$ is at least $c(x, y)$.


Lemma 2.5.24 For any $i$ in $[(\log n)/b - 1]$, $E[c(x_i, x_{i+1})] = O(a_i)$ and
$E[c(y_i, y_{i+1})] = O(b_i)$.

Proof: We first observe that $c(x_i, x_{i+1})$ (resp., $c(y_i, y_{i+1})$) is at most $a_k$ (resp., $b_k$), where
$k$ is the least $j \ge i$ such that $s_j$ (resp., $t_j$) belongs to $\{0, 1, 2\}$ (resp., $\{0\}$); if such a $j$ does
not exist, then $k$ is $(\log n)/b - 1$. Thus, $k$ is at most $i + \tau_i$. By Lemma 2.5.23, it follows
that for any $j \ge i$, the probability that $k \ge j$ is at most $(10\varepsilon)^{j-i}$. By Lemma 2.5.9, we thus
have

and 

E[c{y^.yi+,)]  <  ^ 

i>^ 

since  <  1  by  Equation  2.6.  Q.E.D. 

We now use Lemmas 2.5.9, 2.5.23, and 2.5.24 to establish Theorem 1.

Proof of Theorem 1: By Equation 2.7, the expected cost of the read operation is bounded
by the expected value of $f(\ell(A)) \sum_{0 \le i < \tau} O(c(x_i, x_{i+1}) + c(y_i, y_{i+1}))$. (Recall that $\tau$ is
the smallest integer $i \ge i^*$ such that $(s_i, t_i) = (0, 0)$.) We upper bound the two terms
$E[\sum_{0 \le i < i^*} (c(x_i, x_{i+1}) + c(y_i, y_{i+1}))]$ and $E[\sum_{i^* \le i < \tau} (c(x_i, x_{i+1}) + c(y_i, y_{i+1}))]$ separately.
By Lemmas 2.5.9 and 2.5.24, the first term is $O(a_{i^*} + b_{i^*})$. We upper bound the second
term as follows. Since $\tau$ is $i^* + \tau_{i^*}$, we obtain from Lemma 2.5.23 that for any $j \ge 0$, the
probability that $\tau \ge i^* + j$ is at most $(10\varepsilon)^j$. Therefore,

E[  Y.  (c(a’i,a;,+i) +  c(y„y,+i))]  <  ^  j(10£)-'(o,*+j  + 

i*<i<T  j>0 

j>0 

~  0{oi*  “h  bi* ) 

=  0{c{x,y)). 

(The  second  step  follows  from  Lemma  2.5.9.  The  third  step  holds  since  10e2l^*’*°s^  <  1 

by  Equation  2.6.  The  fourth  step  follows  from  Lemma  2.5.9.)  Q.E.D. 

Theorem 2 follows from Lemmas 2.5.6, 2.5.9, and 2.5.24.

Proof of Theorem 2: Consider an insert operation performed by $x$ for any object. The
expected cost of the operation is bounded by $E[\sum_{0 \le i < (\log n)/b} c(x_i, x_{i+1})]$, which by Lemmas
2.5.9 and 2.5.24 is $O(a_{(\log n)/b - 1}) = O(C)$.

We now consider the cost of the delete operation. By Lemma 2.5.6, for each $i$, the
number of reverse $(i, j)$-neighbors of $x_i$ for any $j$ is $O(\log n)$ with high probability, where
$x_i$ is the $i$th node in the primary neighbor sequence of $x$. Therefore, the expected cost of
the delete operation executed by $x$ is bounded by the product of $E[\sum_{0 \le i < (\log n)/b} c(x_i, x_{i+1})]$
and $O(\log n)$. By Lemma 2.5.24, it follows that the expected cost of a delete operation is
$O(C \log n)$. Q.E.D.

2.5.4  Auxiliary  memory 

Proof of Theorem 3: We first place an upper bound on the size of the neighbor table
of any $u$ in $V$. By definition, the number of primary and secondary neighbors of $u$ is at
most $(d + 1)2^b(\log n)/b$, which is $O(\log n)$. By Corollary 2.5.6.1, the number of reverse
neighbors of $u$ is $O(\log^2 n)$ with high probability.

We next place an upper bound on the size of the pointer list of any $u$ in $V$. The size of
$Ptr(u)$ is at most the number of triples of the form $(A, v, \cdot)$, where $A$ is in $\mathcal{A}$ and $v$ is in $V$
such that (i) there exists $i$ in $[(\log n)/b]$ such that $v$ is an $i$-leaf of $u$, (ii) $A[j] = u[j]$ for all
$j$ in $[i]$, and (iii) $A$ is in the main memory of $v$.

By Lemma 2.5.7, the number of $i$-leaves of $u$ is $O(2^{ib} \log n)$ with high probability. The
probability that $A[j] = u[j]$, for all $j$ in $[i]$, is at most $1/2^{ib}$. Since the number of objects in
the main memory of any node is at most $\ell$, it follows that with high probability, $|Ptr(u)|$ is
at most $\sum_{i \in [(\log n)/b]} O(\ell \log n)$, which is $O(\ell \log^2 n)$.

Combining the bounds on the sizes of the neighbor table and pointer list, we obtain that
the size of the auxiliary memory of $u$ is $O(\ell \log^2 n)$ with high probability. Q.E.D.

2.5.5  Adaptability 

Proof of Theorem 4: By Lemma 2.5.6, for any node $u$, the number of nodes for which
$u$ is a primary or secondary neighbor is $O(\log n)$ expected and $O(\log^2 n)$ with high
probability. Moreover, $u$ is a reverse neighbor of $O(\log n)$ nodes since $u$ has $O(\log n)$ primary
neighbors. Therefore, the adaptability of our scheme is $O(\log n)$ expected and $O(\log^2 n)$
with high probability. Q.E.D.

2.6  Future  work 

We would like to extend our study to more general classes of cost functions and determine
tradeoffs among the various complexity measures. It would also be interesting to consider
models that allow faults in the network. We believe that our access scheme can be extended
to perform well in the presence of faults, as the distribution of control information in our
scheme is balanced among the nodes of the network.


Chapter  3 

Fast Algorithms for Finding
O(Congestion+Dilation) Packet Routing
Schedules


3.1  Introduction 

In  this  chapter,  we  consider  the  problem  of  scheduling  the  movements  of  packets  whose 
paths  through  a  network  have  already  been  determined.  The  problem  is  formalized  as 
follows.  We  are  given  a  network  with  n  nodes  (switches)  and  m  edges  (communication 
channels).  Each  node  can  serve  as  the  source  or  destination  of  an  arbitrary  number  of 
packets  (or  cells  or  flits,  as  they  are  sometimes  referred  to).  Let  N  denote  the  total  number 
of  packets  to  be  routed.  The  goal  is  to  route  the  N  packets  from  their  origins  to  their 
destinations  via  a  series  of  synchronized  time  steps,  where  at  each  step  at  most  one  packet 
can  traverse  each  edge,  and  each  packet  can  traverse  at  most  one  edge  at  each  step.  Without 
loss  of  generality,  we  assume  that  all  edges  in  the  network  are  used  in  the  path  of  some 
packet,  and  thus  that  m  gives  the  number  of  such  edges  (all  the  other  edges  are  irrelevant 
to  our  problem). 

Figure  3.1  shows  a  5-node  network  in  which  one  packet  is  to  be  routed  to  each  node. 
The  shaded  nodes  in  the  figure  represent  switches,  and  the  edges  between  the  nodes  repre¬ 
sent  channels.  A  packet  is  depicted  as  a  square  box  containing  the  label  of  its  destination. 

This  is  joint  work  with  Tom  Leighton,  MIT,  and  Bruce  Maggs,  CMU.  This  work  appears  in  [28] . 




Figure  3.1:  A  graph  model  for  packet  routing. 

During  the  routing,  packets  wait  in  three  different  kinds  of  queues.  Before  the  routing 
begins,  packets  are  stored  at  their  origins  in  special  initial  queues.  When  a  packet  traverses 
an  edge,  it  enters  the  edge  queue  at  the  end  of  that  edge.  A  packet  can  traverse  an  edge 
only  if  at  the  beginning  of  the  step,  the  edge  queue  at  the  end  of  that  edge  is  not  full.  Upon 
traversing  the  last  edge  on  its  path,  a  packet  is  removed  from  the  edge  queue  and  placed  in  a 
special final queue at its destination. In Figure 3.1, all of the packets reside in initial queues.
For example, packets 4 and 5 are stored in the initial queue at node 1. In this example, each
edge  queue  is  empty,  but  has  the  capacity  to  hold  two  packets.  Final  queues  are  not  shown 
in  the  figure.  Independent  of  the  routing  algorithm  used,  the  sizes  of  the  initial  and  final 
queues  are  determined  by  the  particular  packet  routing  problem  to  be  solved.  Thus,  any 
bound  on  the  maximum  queue  size  required  by  a  routing  algorithm  refers  only  to  the  edge 
queues. 

We  focus  on  the  problem  of  timing  the  movements  of  the  packets  along  their  paths.  A 
schedule  for  a  set  of  packets  specifies  which  move  and  which  wait  at  each  time  step.  The 
length  of  a  schedule  is  the  number  of  time  steps  required  to  route  all  the  packets  to  their 
destinations  according  to  the  schedule.  Given  any  underlying  network,  and  any  selection 
of  paths  for  the  packets,  our  goal  is  to  produce  a  schedule  for  the  packets  that  minimizes 
the  length  of  the  schedule  and  the  maximum  queue  size  needed  to  route  all  of  the  packets 
to their destinations.

Figure 3.2: A set of paths for the packets with dilation d = 3 and congestion c = 3.

Of  course,  there  is  a  strong  correlation  between  the  time  required  to  route  the  packets 
and  the  selection  of  the  paths.  In  particular,  the  maximum  distance,  d,  traveled  by  any 
packet  is  always  a  lower  bound  on  the  time.  We  call  this  distance  the  dilation  of  the  paths. 
Similarly,  the  largest  number  of  packets  that  must  traverse  a  single  edge  during  the  entire 
course  of  the  routing  is  a  lower  bound.  We  call  this  number  the  congestion,  c,  of  the  paths. 
Figure  3.2  shows  a  set  of  paths  for  the  packets  of  Figure  3.1  with  dilation  3  (since  the  path 
followed  by  the  packet  going  from  node  5  to  node  3  has  length  3)  and  congestion  3  (since 
three  paths  use  the  edge  between  nodes  1  and  2). 
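
The two parameters can be computed directly from a list of paths. The sketch below is illustrative: paths are given as node sequences, edges are treated as undirected, paths are assumed to be edge-simple, and the sample paths are hypothetical (they are not the paths drawn in Figure 3.2, although they are chosen to have the same dilation and congestion).

```python
from collections import Counter


def dilation(paths):
    """Maximum number of edges on any packet's path."""
    return max(len(path) - 1 for path in paths)


def congestion(paths):
    """Maximum number of paths that use the same (undirected) edge."""
    use = Counter()
    for path in paths:
        for u, v in zip(path, path[1:]):
            use[frozenset((u, v))] += 1
    return max(use.values())


# Hypothetical paths: three of them share the edge between nodes 1 and 2,
# and the path from node 5 to node 3 has length 3.
paths = [[5, 1, 2, 3], [1, 2, 4], [1, 2], [3, 4], [4, 5]]
```

Both quantities are lower bounds on the length of any schedule for these paths, as noted above.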

3.1.1  Related  work 

Given  any  set  of  paths  with  congestion  c  and  dilation  d,  in  any  network,  it  is  straightforward 
to  route  all  of  the  packets  to  their  destinations  in  cd  steps  using  queues  of  size  c  at  each 
edge.  In  this  case  the  queues  are  big  enough  that  a  packet  can  never  be  delayed  by  a  full 
queue in front, so each packet can be delayed at most c - 1 steps at each of at most d edges
on  the  way  to  its  destination. 
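
The cd bound can be observed in a small simulation. The sketch below is an illustration under assumptions: edge queues are unbounded, edges are directed, congestion is counted per directed edge, and at each step every edge forwards the waiting packet with the lowest index (this tie-breaking rule is an arbitrary choice; the argument above applies to any rule that never leaves an edge idle while a packet is waiting for it). The function names and sample paths are hypothetical.

```python
from collections import Counter


def congestion_and_dilation(paths):
    """Return (c, d) for paths given as node sequences over directed edges."""
    use = Counter(e for path in paths for e in zip(path, path[1:]))
    return max(use.values()), max(len(path) - 1 for path in paths)


def route_greedily(paths):
    """Simulate routing with unbounded edge queues; return the step count."""
    pos = [0] * len(paths)  # index of each packet's current node on its path
    steps = 0
    while any(pos[i] < len(path) - 1 for i, path in enumerate(paths)):
        chosen = {}  # edge -> packet that crosses it during this step
        for i, path in enumerate(paths):
            if pos[i] < len(path) - 1:
                edge = (path[pos[i]], path[pos[i] + 1])
                chosen.setdefault(edge, i)  # lowest index wins the edge
        for i in chosen.values():
            pos[i] += 1
        steps += 1
    return steps


# Hypothetical instance with c = 3 and d = 3.
paths = [[5, 1, 2, 3], [1, 2, 4], [1, 2], [3, 4], [4, 5]]
c, d = congestion_and_dilation(paths)
steps = route_greedily(paths)
```

On any instance the simulated length lies between d and cd; the gap between those two values is what the O(c + d) schedules discussed next close.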

In [27], Leighton, Maggs, and Rao showed that there are much better schedules. In
particular, they established the existence of a schedule using O(c + d) steps and constant-size
queues at every edge, thereby achieving the naive lower bounds (up to constant factors)
for any routing problem. The result is highly robust in the sense that it works for any set
of edge-simple paths and any underlying network. (A priori, it would be easy to imagine
that there might be some set of paths on some network that requires more than Ω(c + d)
steps or larger than constant-size queues to route all the packets.) The method that they use
to show the existence of optimal schedules, however, is not constructive. In other words,
prior to this work, the fastest known algorithms for producing schedules of length O(c + d)
with constant-size edge queues required time that is exponential in the number of packets.

Scheideler recently presented in [49] an alternative, simpler proof for the existence of
O(c + d)-step schedules that only require edge queues of size 2. The main idea in his proof
is to decompose the problem in a different way by using so-called "secure edges".

In [35], Meyer auf der Heide and Scheideler showed the existence of an off-line protocol
that only requires edge queues of size 1. However, the schedule produced by this protocol
has length $O([d + c(\log(c + d))(\log\log(c + d))] \log\log\log^{(1+\varepsilon)}(c + d))$, for any constant
$\varepsilon > 0$.

For the class of leveled networks, Leighton, Maggs, Ranade, and Rao [25] showed that
there is a simple on-line randomized algorithm for routing the packets to their destinations
within O(c + L + log N) steps, with high probability, where L is the number of levels in the
network, and N is the total number of packets. (In a leveled network with L levels, each
node is labeled with a level number between 0 and L - 1, and every edge that has its tail
on level i has its head on level i + 1, for 0 ≤ i < L - 1.)

Mansour and Patt-Shamir [33] showed that if packets are routed greedily on shortest
paths, then all of the packets reach their destinations within d + N steps. These schedules
may be much longer than optimal, however, because N may be much larger than c. Meyer
auf der Heide and Vöcking [36] devised a simple on-line randomized algorithm that routes
all packets to their destinations in O(c + d + log N) steps, with high probability, provided
that the paths taken by the packets are short-cut free (e.g., shortest paths).

Recently, Rabani and Tardos [42], and Ostrovsky and Rabani [39] extended the main
ideas used in [27], and in the centralized algorithm presented in this chapter, to obtain
on-line local control algorithms for the general packet routing problem that produce
near-optimal schedules. More specifically, Rabani and Tardos [42] show a randomized on-line
algorithm that with high probability delivers all packets in
$O(c + d((\log^* N)^{O(\log^* N)}) + \log^6 N)$ steps; Ostrovsky and Rabani [39] improved on this
result by presenting a randomized on-line algorithm that delivers all the packets to their
destinations in $O(c + d + \log^{1+\varepsilon} N)$ steps with high probability, for arbitrary $\varepsilon > 0$.


It was also in recent work that Srinivasan and Teo [53] answered a long-standing
question: Given source and destination nodes for each packet, can we select the routing paths
for the N packets, with congestion c and dilation d, in order to approximate the minimum
value of c + d (over all possible choices of paths) to within a constant factor? (Finding the
minimum value of c + d is NP-hard.) They provided an algorithm that selects such paths
in polynomial time; by applying our algorithm on the selected paths, Srinivasan and Teo
described the first off-line constant factor approximation algorithm for routing N packets
(if we are only given the source and destination nodes of each packet) using constant-size
queues. It is interesting to note that there is still no polynomial-time algorithm known for
which the congestion c alone is asymptotically optimal: It was clever (and crucial) that
Srinivasan and Teo minimized the sum c + d rather than just c.


The  problem  of  scheduling  packets  through  given  paths  strongly  relates  to  network 
emulations  via  embeddings,  as  we  will  see.  Koch  et  al.  in  [24],  and  Maggs  and  Schwabe 
in  [32]  address  the  problem  of  performing  network  emulations  via  embeddings. 


Shmoys,  Stein,  and  Wein  [51]  give  randomized  and  deterministic  algorithms  that  pro¬ 
duce  schedules  of  length  within  a  polylogarithmic  factor  of  that  of  an  optimal  schedule, 
for  job-shop  scheduling  when  jobs  are  not  assumed  to  have  unit  length  and  a  machine  may 
have  to  work  on  the  same  job  more  than  once.  Our  results  can  be  used  to  find  schedules 
of  length  within  a  constant  factor  of  the  optimal  schedule  for  the  less  general  job-shop 
problem  when  jobs  have  unit  length  and  a  machine  can  work  at  most  once  on  any  job,  as 
discussed below.

3.1.2 Our results

In this chapter, we show how to produce schedules of length O(c + d) in
$O(m(c + d)(\log \mathcal{P})^4(\log\log \mathcal{P}))$ time steps, with probability at least $1 - 1/\mathcal{P}^{\beta}$, for any constant
$\beta > 0$, where m is the number of distinct edges traversed by some packet in the network.
The schedules can also be found in polylogarithmic time on a parallel computer
using $O(m(c + d)(\log \mathcal{P})^4(\log\log \mathcal{P}))$ work, with probability at least $1 - 1/\mathcal{P}^{\beta}$.

The algorithm for producing the schedules is based on an algorithmic form of the
Lovász Local Lemma (see [12] or [52, pp. 57-58]) discovered by Beck [8]. Showing how
to modify Beck's arguments so that they can be applied to scheduling problems is the main
contribution of our work. Once this is done, the construction of asymptotically optimal
routing schedules is accomplished using the methods of [27].

The  result  has  several  applications.  For  example,  if  a  particular  routing  problem  is 
to  be  performed  many  times  over,  then  it  may  be  feasible  to  compute  an  asymptotically 
optimal  schedule  once  using  global  control.  This  situation  arises  in  network  emulation 
problems  (see  Chapter  1).  Suppose  a  network  G  is  being  emulated  by  a  host  network  H  by 
embedding  G  into  H.  The  algorithm  described  in  this  chapter  can  be  used  to  produce  a 
schedule in which the packets are routed to their destinations in O(c + d) steps. Thus, H
can emulate each step of G in O(l + c + d) steps, where l is the load of this embedding
(i.e., the maximum number of nodes of G that are mapped to a node of H).

The result also has applications to job-shop scheduling. In particular, consider a
scheduling problem with jobs $j_1, \ldots, j_r$, and machines $m_1, \ldots, m_s$, for which each job must be
performed on a specified sequence of machines. In this application, we assume that each
job occupies each machine that works on it for a unit of time, and that no machine has
to work on any job more than once. Of course, the jobs correspond to packets, and the
machines correspond to edges in the packet routing problem. Hence, we can define the
dilation of the scheduling problem to be the maximum number of machines that must work
on any job, and the congestion to be the maximum number of jobs that have to be run on
any machine. As a consequence of the packet routing result, we know that any scheduling
problem can be solved in O(c + d) steps. In addition, we know that there is a schedule for
which each job waits at most O(c + d) steps before it starts running, and that each job waits
at most a constant number of steps in between consecutive machines. The queue of jobs
waiting for any machine will also always be at most a constant.
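
The correspondence between jobs and packets can be made concrete with a short computation. The sketch below is illustrative only: the instance and the function name are hypothetical, each job is given as the ordered list of machines it must visit, and no machine appears twice in the same job, as assumed above.

```python
from collections import Counter


def job_shop_parameters(jobs):
    """Return (congestion, dilation) for a unit-time job-shop instance.

    jobs maps a job name to its ordered list of machines.
    """
    dil = max(len(machines) for machines in jobs.values())
    load = Counter(m for machines in jobs.values() for m in machines)
    return max(load.values()), dil


# Hypothetical instance: machine m3 is needed by all three jobs.
jobs = {"j1": ["m1", "m2", "m3"], "j2": ["m2", "m3"], "j3": ["m3"]}
c, d = job_shop_parameters(jobs)
```

Here each machine plays the role of an edge and each job the role of a packet, so the same two parameters govern the schedule length.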

3.1.3  Outline 

The  remainder  of  the  chapter  is  divided  into  sections  as  follows.  In  Section  3.2,  we  give 
a  very  brief  overview  of  the  non-constructive  proof  in  [27].  We  also  introduce  some  defi¬ 
nitions,  and  present  two  important  propositions  and  a  new  lemma  that  will  be  of  later  use. 
In  Section  3.3,  we  describe  how  to  make  the  non-constructive  method  in  [27]  constructive. 
In  Section  3.4,  we  analyze  the  running  time  of  the  algorithm.  The  propositions  presented 
in  Sections  3.2  and  3.3  are  meant  to  replace  (and  are  numbered  according  to)  some  of  the 
lemmas  in  [27]. 

In  Section  3.5,  we  show  how  to  parallelize  the  scheduling  algorithm.  We  conclude  with 
some  remarks  in  Section  3.6. 

3.2  Preliminaries 

In [27], Leighton, Maggs, and Rao proved that for any set of packets whose paths are
edge-simple and have congestion $c$ and dilation $d$, there is a schedule of length $O(c + d)$ in which
at most one packet traverses each edge of the network at each step, and at most a constant
number of packets wait in each queue at each step. (An edge-simple path uses no edge
more than once.) Note that there are no restrictions on the size, topology, or degree of the
network or on the number of packets.

The strategy for constructing an efficient schedule is to make a succession of
refinements to the "greedy" schedule, $S_0$, in which each packet moves at every step until it
reaches its final destination. This initial schedule is as short as possible: its length is only
$d$. Unfortunately, as many as $c$ packets may traverse an edge at a single time step in $S_0$,
whereas in the final schedule at most one packet is allowed to traverse an edge at each step.
Each refinement will bring us closer to meeting this requirement.
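
The following sketch makes the greedy schedule concrete: every packet crosses the $i$-th edge of its path at step $i$, and we record the schedule length and the largest number of packets that cross one edge in one step. The path representation and the function name are assumptions of this example.

```python
from collections import Counter

def greedy_schedule_load(paths):
    """Return (length, worst) for the greedy schedule in which no packet ever waits."""
    load = Counter()
    for path in paths:
        # the packet crosses its i-th edge at time step i (steps are 1-indexed)
        for step, edge in enumerate(path, start=1):
            load[(edge, step)] += 1
    length = max((len(path) for path in paths), default=0)  # equals the dilation d
    worst = max(load.values(), default=0)                   # can be as large as c
    return length, worst

length, worst = greedy_schedule_load([["a", "b"], ["a", "c"], ["d", "a"]])
```

Here two packets cross edge `a` at step 1, so the greedy schedule has length 2 but violates the one-packet-per-edge-per-step requirement.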

The proof uses the Lovász Local Lemma ([12] or [52, pp. 57-58]) at each refinement
step. Given a set of "bad" events in a probability space, the lemma provides a simple
inequality that, when satisfied, guarantees that with probability greater than zero, no bad
event occurs. The inequality relates the probability that each bad event occurs with the
dependence among them. A set of events $A_1, A_2, \ldots, A_q$ in a probability space has
dependence at most $b$ if every event is mutually independent of some set of $q - b - 1$ other events.
The lemma is non-constructive; for a discrete probability space, it shows only that there
exists some elementary outcome that is not in any bad event.

Lemma [Lovász] Let $A_1, A_2, \ldots, A_q$ be a set of "bad" events, each occurring with
probability $p$, and with dependence at most $b$. If $4pb < 1$, then with probability greater than zero,
no bad event occurs. Q.E.D.

Before proceeding, we need to introduce some notation. A $T$-frame (or a frame of
size $T$) is a sequence of $T$ consecutive time steps. The congestion of an edge $g$ in a
$T$-frame is the number of packets that traverse $g$ in this frame; the relative congestion of
edge $g$ in a $T$-frame is given by the congestion of $g$ in the frame divided by the frame size
$T$. The frame congestion in a $T$-frame is the largest congestion of an edge in the frame; the
relative congestion in a $T$-frame is the largest relative congestion of an edge in the frame.

3.2.1  A  pair  of  tools  for  later  use 

In  this  section  we  re-state  Lemma  3.5  of  [27]  and  we  prove  Proposition  3.6,  which  replaces 
Lemma  3.6  of  [27].  Both  will  be  used  in  the  proofs  in  Section  3.3. 

Lemma 3.5 [27] In any schedule, if the number of packets that traverse a particular edge
$g$ in any $y$-frame is at most $Ry$, for all $y$ between $T$ and $2T - 1$, then the number of packets
that traverse $g$ in any $y$-frame is at most $Ry$, for all $y \ge T$.

Proof: Consider a frame $\tau$ of size $T'$, where $T' > 2T - 1$. The first $(\lfloor T'/T \rfloor - 1)T$
steps of the frame can be broken into $T$-frames. In each of these $T$-frames, at most $RT$
packets traverse $g$. The remainder of the $T'$-frame $\tau$ consists of a single $y$-frame, where
$T \le y \le 2T - 1$, in which at most $Ry$ packets traverse $g$. Q.E.D.
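
The decomposition used in this proof can be sketched as follows: a frame of size $T'$ is cut into $\lfloor T'/T \rfloor - 1$ frames of size $T$ followed by one final frame whose size lies between $T$ and $2T - 1$. The function name is hypothetical.

```python
def split_frame(T_prime, T):
    """Split a frame of size T_prime >= T into T-frames plus one y-frame, T <= y <= 2T - 1."""
    if T_prime < T:
        raise ValueError("the frame must have size at least T")
    full = T_prime // T - 1      # number of leading T-frames
    y = T_prime - full * T       # size of the remaining frame, equal to T + (T_prime mod T)
    return [T] * full + [y]

pieces = split_frame(10, 3)      # two 3-frames followed by one 4-frame
```

Summing the bound $R \cdot (\text{piece size})$ over the pieces gives at most $RT'$ packets in the whole frame, which is the claim of the lemma.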

The following proposition is basically a re-statement of Lemma 3.6 of [27] and will
be used here in place of this lemma. Proposition 3.6 applies when the number of distinct
edges traversed by the packets in the schedule considered, $m'$, is at most a polynomial in $I$
($I$ as defined below).

Proposition 3.6 Suppose that (i) there are positive constants $\alpha_1$, $\alpha_2$, $\alpha_3$, where $\alpha_1 > \alpha_2$,
$\alpha_1 > 2\alpha_3$, and $\alpha_3 > \alpha_2$; (ii) $I$ is larger than some sufficiently large constant; and (iii)
for every edge $g$, in a schedule of length $I^{\alpha_1}$ (or smaller) the relative congestion of edge $g$ in
frames of size $I^{\alpha_2}$ or larger is at most $\rho_g$, where $\rho_g$ is a constant. Let $m'$ be the number
of distinct edges traversed by the packets in this schedule. Furthermore, suppose that each
packet is assigned a delay chosen randomly, independently, and uniformly from the range
$[0, I^{\alpha_2}]$, and that if a packet is assigned a delay of $x$, then $x$ delays are inserted in the first
$I^{\alpha_3}$ steps and $I^{\alpha_2} - x$ delays are inserted in the last $I^{\alpha_3}$ steps of the schedule.

1. Then for any constant $\xi > 0$, there exists a constant $k_1 > 0$ such that with probability
at least $1 - m'/I^{\xi}$ the relative congestion of any edge $g$ in any frame of size $\log^2 I$ or
larger, in between the first and last $I^{\alpha_3}$ steps of the new schedule, is at most $\rho_g(1 + \alpha)$,
for $\alpha = k_1/\sqrt{\log I}$.

2. We can find such a schedule and verify whether it satisfies the conditions in 1. in
$O(m' I^{\alpha_1}(\log^2 I))$ time.


Proof: To bound the relative congestion of each edge in frames of size $\log^2 I$ or larger,
we need to consider all $m'$ edges and, by Lemma 3.5, all frames of size between $\log^2 I$ and
$2\log^2 I - 1$.

As we shall see, the number of packets that traverse an edge $g$ during a particular
$T$-frame $\tau$ has a binomial distribution. In the new schedule, a packet can traverse $g$ during $\tau$
only if in the original schedule it traversed $g$ during $\tau$ or during one of the $I^{\alpha_2}$ steps before
the start of $\tau$. Since the relative congestion of edge $g$ in any frame of size $I^{\alpha_2}$ or greater
in the original schedule is at most $\rho_g$, there are at most $\rho_g(I^{\alpha_2} + T)$ such packets. The
probability that an individual packet that could traverse $g$ during $\tau$ actually does so is at
most $T/I^{\alpha_2}$. Thus, the probability $p$ that $\rho'_g T$ or more packets traverse an edge $g$ during a
particular $T$-frame $\tau$ is at most

$$\sum_{k=\rho'_g T}^{\rho_g(I^{\alpha_2}+T)} \binom{\rho_g(I^{\alpha_2}+T)}{k} \left(\frac{T}{I^{\alpha_2}}\right)^{k} \left(1 - \frac{T}{I^{\alpha_2}}\right)^{\rho_g(I^{\alpha_2}+T)-k}.$$

To estimate the area under the tails of this binomial distribution, we use the following
Chernoff-type bound [10]. Suppose that there are $x$ independent Bernoulli trials, each of
which is successful with probability $p'$. Let $S$ denote the number of successes in the $x$
trials, and let $\mu = E[S] = xp'$. Following Angluin and Valiant [2], we have

$$\Pr[S \ge (1 + \gamma)\mu] \le e^{-\gamma^2\mu/3}$$

for $0 \le \gamma \le 1$. (Note that $e$ denotes the base of the natural logarithm.)
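
As a numerical sanity check, the sketch below compares an exact binomial upper tail with the standard Angluin-Valiant form of the bound, $e^{-\gamma^2\mu/3}$. The parameter values are arbitrary choices for illustration and do not come from the analysis.

```python
import math

def binomial_upper_tail(x, p, threshold):
    """Exact Pr[S >= threshold] for S ~ Binomial(x, p)."""
    first = max(0, math.ceil(threshold))
    return sum(math.comb(x, k) * p**k * (1 - p)**(x - k) for k in range(first, x + 1))

def angluin_valiant_bound(x, p, gamma):
    """Upper bound exp(-gamma^2 * mu / 3) on Pr[S >= (1 + gamma) * mu], for 0 <= gamma <= 1."""
    mu = x * p
    return math.exp(-gamma**2 * mu / 3)

x, p, gamma = 200, 0.1, 0.5          # illustrative values only
exact = binomial_upper_tail(x, p, (1 + gamma) * x * p)
bound = angluin_valiant_bound(x, p, gamma)
```

The bound is loose for small instances like this one; in the analysis it is the exponential decay in $\mu$ that matters.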

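In our application, $x = \rho_g(I^{\alpha_2} + T)$, $p' = T/I^{\alpha_2}$, and $\mu = \rho_g(I^{\alpha_2} + T)T/I^{\alpha_2}$. For
$\gamma = \sqrt{3k_0/\log I}$ (where $k_0$ is a positive constant to be specified later), $\rho_g \ge 1$, and $T \ge \log^2 I$,
we have

$$\Pr[S \ge (1 + \gamma)\mu] \le e^{-k_0\rho_g(I^{\alpha_2}+T)T/(I^{\alpha_2}\log I)} \le e^{-k_0\log I} \le e^{-k_0\ln I} = 1/I^{k_0}.$$

Set $\rho'_g T$ to be $(1 + \gamma)\mu = (1 + \sqrt{3k_0/\log I})\rho_g(I^{\alpha_2} + T)T/I^{\alpha_2}$. For $I$ large enough,
$2\log^2 I/I^{\alpha_2} \le 1/\sqrt{\log I}$, and thus $\rho'_g \le \rho_g(1 + k_1/\sqrt{\log I})$, for some constant $k_1$
depending only on $k_0$ (any $k_1 \ge 2\sqrt{3k_0} + 1$ suffices). Let $\alpha = k_1/\sqrt{\log I}$. Then
$\rho'_g \le \rho_g(1 + \alpha)$. Thus $p = \Pr[S \ge \rho'_g T] \le \Pr[S \ge (1 + \gamma)\mu] \le 1/I^{k_0}$.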

Since there are at most $(I^{\alpha_1} + I^{\alpha_2}) \le 2I^{\alpha_1}$ starting points for a frame, and $\log^2 I$
different size frames starting at each point, and there are at most $m'$ distinct edges per
frame, the probability that the relative congestion of any edge $g$ is more than $\rho'_g$ in any
frame is at most $2m'I^{\alpha_1}\log^2 I/I^{k_0} \le m'/I^{k_0-\alpha_1-2}$ (since $\log^2 I \le I$ for $I$ large enough).
For any $\xi > 0$, setting $k_0 = \xi + \alpha_1 + 2$, which in turn sets $k_1$ and $\alpha$, completes the proof of part 1.

We assign a random delay to each packet, and verify whether the conditions in 1. apply
as follows. We construct the schedule by routing all the packets one step at a time. At time
$t$, for $1 \le t \le (I^{\alpha_1} + I^{\alpha_2})$, we compute the congestion of edge $g$ in a $T$-frame ending at $t$, for
all edges $g$ that are traversed by some packet in the schedule, for all $T \in [\log^2 I, 2\log^2 I - 1]$,
using the following rules: (i) if $T = \log^2 I$ then the congestion of $g$ in a $T$-frame ending
at time $t$ can be computed by taking the congestion of $g$ in the $T$-frame ending at $t - 1$,
subtracting the number of packets that traverse edge $g$ at time $t - T$ and adding the number
of packets that traverse $g$ at time $t$; otherwise (ii) if $t > T$, then the congestion of edge $g$
in a $T$-frame that ends at $t$ is given by the congestion of $g$ in a $(T - 1)$-frame that ends at
$t - 1$ plus the number of packets that traverse edge $g$ at time $t$. The relative congestion of
an edge in a frame is given by the congestion of the edge in the frame divided by the size of
the frame. This can be done in time $O(m'(I^{\alpha_1} + I^{\alpha_2})\log^2 I) = O(m'I^{\alpha_1}\log^2 I)$, $m'$ being
the number of distinct edges traversed in this schedule. Q.E.D.
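
The two update rules can be sketched for a single edge as below. The input `counts` (number of packets crossing the edge at each step), the name `T_min` (standing in for $\log^2 I$), and the explicit handling of the very first frame of the smallest size are assumptions of this example.

```python
def frame_congestions(counts, T_min):
    """cong[(T, t)] = packets crossing the edge in the T-frame ending at step t (1-indexed),
    for every frame size T with T_min <= T <= 2 * T_min - 1."""
    n = len(counts)

    def at(t):
        return counts[t - 1] if 1 <= t <= n else 0

    cong = {}
    for t in range(1, n + 1):
        for T in range(T_min, 2 * T_min):
            if t < T:
                continue                      # no complete T-frame ends this early
            if T == T_min:
                if t == T:
                    cong[(T, t)] = sum(counts[:T])   # first frame, computed directly
                else:
                    # rule (i): slide the window one step to the right
                    cong[(T, t)] = cong[(T, t - 1)] - at(t - T) + at(t)
            else:
                # rule (ii): extend the (T - 1)-frame ending at t - 1 by one step
                cong[(T, t)] = cong[(T - 1, t - 1)] + at(t)
    return cong

counts = [2, 0, 1, 3, 1, 0, 2]
cong = frame_congestions(counts, 2)
```

Each table entry costs constant time, so one edge costs time proportional to the schedule length times the number of frame sizes, in line with the bound stated in the proof.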

In the remainder of this chapter, for a schedule of size polynomial in $I$, we assume
we check for the congestions of all $T$-frames, $\log^2 I \le T < 2\log^2 I$, of the schedule as
described in the proof of Proposition 3.6.


3.3  An  algorithm  for  constructing  optimal  schedules 

In this section, we describe the key ideas required to make the non-constructive proof
of [27] constructive. There are many details in that proof, but changes are required only
where the Lovász Local Lemma is used, in Lemmas 3.2, 3.7, and 3.9 of [27]. The
non-constructive proof showed that a schedule can be modified by assigning delays to the
packets in such a way that in the new schedule the relative congestion can be bounded in much
smaller frames than in the old schedule. In this chapter, we show how to find the
assignment of delays quickly. We will not regurgitate the entire proof of [27], but only reprove
those lemmas, trying to state the replacement propositions in a way as close as possible to
the original lemmas.

In Section 3.3.1, we provide a proposition, Proposition 3.2, that is a constructive version
of Lemma 3.2 of [27]. In Sections 3.3.2 and 3.3.3, we provide three propositions that are
meant to replace Lemma 3.7 of [27]. Lemma 3.7 is applied $O(\log^*(c + d))$ times in [27].
We will use Propositions 3.7.1 and 3.7.2 to replace the first two applications of Lemma 3.7.
The remaining applications will be replaced by Proposition 3.7.3. In Section 3.3.4, we
present the three replacement propositions for Lemma 3.9 of [27]. Our belief is that a reader
who understands the structure of the proof in [27] and the propositions in this chapter can
easily see how to make the original proof constructive. We analyze the running time of our
algorithm in Section 3.4.

3.3.1  The  first  reduction  in  frame  size

For a given set of $N$ packets, let $c$ and $d$ denote the congestion and dilation of the paths
taken by these packets, and let $\mathcal{P}$ denote the sum of the lengths of these paths. Note that
$m \le \mathcal{P} \le mc$, and that $c, d \le \mathcal{P}$, where $m$ is the number of edges traversed by some
packet in the network. The following proposition is meant to replace Lemma 3.2 of [27].
It is used just once in the proof, to reduce the frame size from $d$ to $\log \mathcal{P}$.

Proposition 3.2 For any constant $\beta > 0$, there is a constant $\alpha > 0$, such that there exists an
algorithm that constructs a schedule of length $d + \alpha c$ in which packets never wait in edge
queues and in which the relative congestion in any frame of size $\log \mathcal{P}$ or larger is at most
1. The algorithm runs in $O(m(c + d)(\log \mathcal{P}))$ time steps, and succeeds with probability at
least $1 - 1/\mathcal{P}^{\beta}$.

Proof: The algorithm is simple: assign each packet an initial delay that is chosen
randomly, independently, and uniformly from the range $[0, \alpha c]$, where $\alpha$ is a constant that will
be specified later; the packet will wait out its initial delay and then travel to its destination
without stopping. The length of the new schedule is at most $\alpha c + d$.

To bound the relative congestion in frames of size $\log \mathcal{P}$ or larger, we need to consider
all $m$ edges and, by Lemma 3.5, all frames of size between $\log \mathcal{P}$ and $2\log \mathcal{P} - 1$. We
assume without loss of generality that $c > 2\log \mathcal{P}$, since any frame of length $c$ or larger
has relative congestion at most 1. For any particular edge $g$, and $T$-frame $\tau$, where
$\log \mathcal{P} \le T \le 2\log \mathcal{P} - 1$, the probability $p$ that more than $T$ packets use $g$ in $\tau$ is at most

$$\binom{c}{T}\left(\frac{T}{\alpha c}\right)^{T} \le \left(\frac{e}{\alpha}\right)^{T},$$

since each of the at most $c$ packets that pass through $g$ has probability at most $T/\alpha c$ of
using $g$ in $\tau$, and since $\binom{a}{b} \le (ae/b)^b$, for any $0 < b \le a$. The total number of frames to
consider is at most $(\alpha c + d)\log \mathcal{P}$, since there are at most $\alpha c + d$ places for a frame to start,
and $\log \mathcal{P}$ frame sizes. Thus the probability that the relative congestion of any edge is too
large in any frame of size $\log \mathcal{P}$ or larger is at most

$$m(\log \mathcal{P})(\alpha c + d)\left(\frac{e}{\alpha}\right)^{\log \mathcal{P}}.$$

We bound the probability above by summing the probabilities that the relative congestion
is too large for all $m \log \mathcal{P}$ edge-frame pairs. Using the inequalities $\mathcal{P} \ge c$, $\mathcal{P} \ge m$, and
$\mathcal{P} \ge d$, we have that for any constant $\beta > 0$, there exists a constant $\alpha > 0$, such that this
probability is at most $1/\mathcal{P}^{\beta}$.

Assigning the delays to the packets takes $O(N)$ time steps. Verifying whether the
relative congestion is at most 1 in any $T$-frame of size $\log \mathcal{P} \le T \le 2\log \mathcal{P} - 1$ can be done
in $O(m(c + d)(\log \mathcal{P}))$ time steps (see the last paragraph of the proof of Proposition 3.6).
Q.E.D.

Before applying Proposition 3.7.1, we first apply Proposition 3.2 to produce a schedule
$S_1$ of length $O(c + d)$ in which the relative congestion in any frame of size $\log \mathcal{P}$ or larger is
at most 1. For any positive constant $\beta$, this step succeeds with probability at least $1 - 1/\mathcal{P}^{\beta}$.
If it fails, we simply try again.
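
A minimal sketch of this assign-verify-retry loop is given below. It assumes a non-empty list of paths given as lists of edge ids, uses a brute-force frame check instead of the incremental one from the proof of Proposition 3.6, and treats `frame_size` as a stand-in for $\log \mathcal{P}$; the function names, the retry cap, and the seed are hypothetical.

```python
import random
from collections import Counter, defaultdict

def relative_congestion_ok(paths, delays, frame_size):
    """True if every T-frame, frame_size <= T <= 2*frame_size - 1, sees at most T packets per edge."""
    crossings = defaultdict(Counter)              # edge -> step -> number of packets
    for path, delay in zip(paths, delays):
        for k, edge in enumerate(path):
            crossings[edge][delay + k + 1] += 1   # no waiting after the initial delay
    horizon = max(delay + len(path) for path, delay in zip(paths, delays))
    for steps in crossings.values():
        for T in range(frame_size, 2 * frame_size):
            for end in range(T, horizon + 1):
                if sum(steps[t] for t in range(end - T + 1, end + 1)) > T:
                    return False
    return True

def schedule_with_initial_delays(paths, alpha, frame_size, seed=0, max_tries=100):
    """Draw delays uniformly from [0, alpha * c] until the frame condition holds."""
    rng = random.Random(seed)
    c = max(Counter(edge for path in paths for edge in path).values())
    for _ in range(max_tries):
        delays = [rng.randint(0, int(alpha * c)) for _ in paths]
        if relative_congestion_ok(paths, delays, frame_size):
            return delays
    return None                                   # give up after max_tries attempts

paths = [["a", "b", "c"]] * 6
delays = schedule_with_initial_delays(paths, alpha=4, frame_size=2)
```

The retry cap exists only to keep the sketch terminating; the analysis says that a single attempt already succeeds with high probability once $\alpha$ is a large enough constant.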

3.3.2  A  randomized  algorithm  to  reduce  the  frame  size 

In this section, we prove two very similar propositions, Propositions 3.7.1 and 3.7.2, that are
meant to replace the first two applications of Lemma 3.7 of [27], which we state below. In
proving all these propositions, we use a constructive version of the Lovász Local Lemma
that applies to scheduling problems. Let $I > 0$. We break a schedule $S$ into blocks of
$2I^3 + 2I^2 - I$ consecutive time steps. The size of a block is the number of time steps it
spans.

Lemma 3.7 [27] In a block of size $2I^3 + 2I^2 - I$, let the relative congestion in any frame
of size $I$ or greater be at most $r$, where $1 \le r \le I$. Then there is a way of assigning delays
to the packets so that in between the first and the last $I^2$ steps of this block, the relative
congestion in any frame of size $I_1 = \log^2 I$ or greater is at most $r_1 = r(1 + \epsilon_1)$, where
$\epsilon_1 = O(1)/\sqrt{\log I}$.

After applying Proposition 3.2 to reduce the frame size from $d$ to $\log \mathcal{P}$,
Propositions 3.7.1 and 3.7.2 are used once on each block (for $I = \log \mathcal{P}$ and $I = (\log\log \mathcal{P})^2$,
respectively) to further reduce the frame size. Unlike Lemma 3.7 of [27], Propositions 3.7.1
and 3.7.2 may increase the relative congestion by a constant factor. In general, we cannot
afford to pay a constant factor at each of the $O(\log^*(c + d))$ applications of Lemma 3.7 of
[27], but we can afford to pay it twice.

These two propositions avoid the use of exhaustive search, since they relate to problem
sizes that are still large: for these propositions we designed separate algorithms that use the
fact that the problem sizes are sufficiently large that we can guarantee a "good" solution
with high probability. In the remainder of this chapter, we assume without loss of generality
that $\mathcal{P}$ is not a constant. If $\mathcal{P} = O(1)$, then we can find an optimal schedule in a constant
number of time steps by using exhaustive search. For the application of Proposition 3.7.1,
$I = \log \mathcal{P}$ and $r = 1$. With probability at least $1 - 1/\mathcal{P}^{\beta}$, for any constant $\beta > 0$, we
succeed in producing a schedule $S_2$ in which the relative congestion is $O(1)$ in frames of
size $\log^2 I = (\log\log \mathcal{P})^2$ or greater (if we should fail, we simply try again). In the
application of Proposition 3.7.2, $I = (\log\log \mathcal{P})^2$, and $r = O(1)$. In the resulting schedule, $S_3$,
the relative congestion is $O(1)$ in frames of size $\log^2((\log\log \mathcal{P})^2) = (\log\log\log \mathcal{P})^{O(1)}$
or greater, with probability at least $1 - 1/\mathcal{P}^{\beta}$, for any constant $\beta > 0$. At this point, the
problem sizes are small enough for using exhaustive search, and we start using
Proposition 3.7.3.

Proposition 3.7.1 Let the relative congestion in any frame of size $I$ or greater be at most $r$
in a block of size $2I^3 + 2I^2 - I$, where $1 \le r \le I$ and $I = \log \mathcal{P}$. Let $Q$ be the sum of the
lengths of the paths taken by the packets in this block. Then, for any constant $\beta > 0$,

1. there is an algorithm for assigning initial delays in the range $[0, I]$ to the packets
so that in between the first and last $I^2$ steps of the block, the relative congestion in
any frame of size $\log^2 I$ or greater is at most $r'$, where $r' = 2r(1 + \alpha)$ and
$\alpha = O(1)/\sqrt{\log\log \mathcal{P}}$;

2. this algorithm runs in $O(Q(\log \mathcal{P})^4(\log\log \mathcal{P}))$ time steps, with probability at least
$1 - 1/\mathcal{P}^{\beta}$.

Proof: We define the bad event for each edge $g$ in the network and each $T$-frame $\tau$, for
$\log^2 I \le T \le 2\log^2 I - 1$, as the event that more than $r'T$ packets use $g$ in $\tau$. A particular
bad event may or may not occur (i.e., may or may not be true) in a given schedule.
If no bad event occurs, then by Lemma 3.5, the relative congestion in all frames of size
$\log^2 I$ or greater will be at most $r'$. Since there are $\log^2 I$ different frame sizes and there
are at most $(2I^3 + 2I^2 - I) + I = 2I^3 + 2I^2$ different frames of any particular size, the
total number of bad events involving any one edge is at most $(2I^3 + 2I^2)\log^2 I \le I^4$, for $I$
greater than some large enough constant.

We now describe the algorithm for finding the assignment. In a first pass of assigning
delays to the packets, we process the packets one at a time. To each packet, we assign a
delay chosen randomly, independently, and uniformly from 0 to $I$. We then examine every
event in which the packet participates.

A packet can use an edge $g$ in a $T$-frame $\tau$ only if it traversed edge $g$ in $\tau$ or in one of
the $I$ steps preceding $\tau$ in the original schedule. At most $r(T + I)$ packets use edge $g$ in a
frame of size $(I + T)$, since the relative congestion in this frame is at most $r$ in the original
schedule. Thus at most $r(T + I)$ packets can traverse edge $g$ in $\tau$ in the new schedule (after
delays are assigned to the packets). We call these $r(I + T)$ packets the candidate packets
to use edge $g$ in $\tau$. Let $C_g$ be the number of candidate packets to use $g$ in $\tau$ that have been
assigned delays so far. We say that the event for an edge $g$ and a $T$-frame $\tau$ is critical if
more than $C_g T/I + kr(I + T)T/(I\sqrt{\log I})$ packets actually traverse edge $g$ in $\tau$, where $k$ is
a positive constant to be specified later. Intuitively, an event becomes critical if the number
of packets assigned delays so far that traverse edge $g$ in $\tau$ exceeds the expected number of
such packets ($C_g T/I$) by an excess term $kr(I + T)T/(I\sqrt{\log I})$. Since $C_g \le r(T + I)$, we
allow an excess of at most $k/\sqrt{\log I}$ times the expected number of packets that would use
edge $g$ in the frame if all of the packets were assigned delays. Hence, the maximum final
excess allowed does not depend on $C_g$. If a packet causes an event to become critical, then
we set aside all of the other packets that could also use $g$ during $\tau$, but whose delays have
not yet been assigned.
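
The first pass described above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: `visits[p]` lists the (edge, step) pairs of packet `p` in the original schedule, paths are edge-simple, logarithms are base 2, frames are identified by (edge, size, start), and all names are hypothetical.

```python
import math
import random
from collections import defaultdict

def events_of(trail, I, T_min):
    """Yield (edge, t0, T, start) for every frame the packet is a candidate for."""
    for edge, t0 in trail:
        for T in range(T_min, 2 * T_min):
            # candidate: it crossed `edge` inside the frame or in the I steps before it
            for start in range(t0 - T + 1, t0 + I + 1):
                yield edge, t0, T, start

def first_pass(visits, I, r, k, T_min, seed=0):
    rng = random.Random(seed)
    candidates = defaultdict(set)
    for p, trail in enumerate(visits):
        for edge, _, T, start in events_of(trail, I, T_min):
            candidates[(edge, T, start)].add(p)
    seen = defaultdict(int)   # C_g: candidates of the event that already have a delay
    used = defaultdict(int)   # of those, how many actually cross the edge in the frame
    delays, set_aside, critical = {}, set(), set()
    root = math.sqrt(math.log2(I))
    for p, trail in enumerate(visits):
        if p in set_aside:
            continue                                  # dealt with in a later pass
        x = delays[p] = rng.randint(0, I)
        for edge, t0, T, start in events_of(trail, I, T_min):
            ev = (edge, T, start)
            seen[ev] += 1
            used[ev] += start <= t0 + x <= start + T - 1
            limit = seen[ev] * T / I + k * r * (I + T) * T / (I * root)
            if used[ev] > limit and ev not in critical:
                critical.add(ev)
                # set aside every candidate of this event that has no delay yet
                set_aside |= {q for q in candidates[ev] if q not in delays}
    return delays, set_aside, critical

visits = [[("g", t)] for t in range(1, 6)]            # five packets crossing one edge
delays, set_aside, critical = first_pass(visits, I=4, r=1, k=10, T_min=2)
```

With a large `k` nothing becomes critical and every packet receives a delay; with `k = 0` the very first packet already makes an event critical and other candidates are set aside.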

The main difference between our algorithm and Beck's [8] constructive version of the
Lovász Local Lemma is that we never allow the number of packets passing through an edge
in a $T$-frame to deviate from the expectation by more than a low order term. In particular,
we do not allow this number to deviate by a constant factor times the expectation. In Beck's
algorithm, the random variable associated with a bad event (in our case, the number of
packets that traverse an edge in a $T$-frame) may deviate from the expectation by a constant
factor times the expectation.

We will deal with the packets that have been set aside later. Let $P$ denote the set
of packets that have been assigned delays. As we shall see, after one pass of assigning
random delays to the packets, the problem of scheduling the packets that have been set
aside is broken into a collection of much smaller subproblems, with probability at least
$1 - 1/\mathcal{P}^{\beta'}$, for any constant $\beta' > 0$. Once the size of a subproblem (given by the number
of edges involved in the subproblem) gets small compared to the frame length, we can try
assigning random delays to the packets that were set aside.

In this initial pass, we assign a random delay to each packet, and check whether the
event for an edge $g$ traversed by the packet in this block and $T$-frame $\tau$ becomes
critical, for all edges $g$ traversed by the packet in this block and for all frames of size $T$
in $[\log^2 I, 2\log^2 I - 1]$, by following the same procedure as described in the last
paragraph of the proof of Proposition 3.6. Here the schedule length after we insert the delays
is $2(I^3 + I^2) = O(\log^3 \mathcal{P})$ (and so there are $O(\log^3 \mathcal{P})$ starting points for a $T$-frame
$\tau$) and there are $\log^2 I = (\log\log \mathcal{P})^2$ different frame sizes to consider. The sum of
the lengths of the paths traversed by the packets in this block is $Q$. Thus, a pass takes
$O(Q(\log \mathcal{P})^3(\log\log \mathcal{P})^2)$ time steps. If a pass fails to reduce the component size, we try
again.

In order to proceed, we must introduce some notation. The dependence graph, $G$, is the
graph in which there is a node for each bad event, and an edge between two nodes if the
corresponding events share a packet. Let $q$ denote the number of distinct edges traversed
by the packets in this block. Let $b$ denote the degree of $G$. Whether or not a bad event
for an edge $g$ and a time frame $\tau$ occurs depends solely on the assignment of delays to the
packets that pass through $g$. Thus, the bad event for an edge $g$ and a time frame $\tau$, and the
bad event for an edge $g'$ and a time frame $\tau'$ are dependent only if $g$ and $g'$ share a packet.
Since at most $r(2I^3 + 2I^2 - I) \le rI^4$ (for $I$ large enough) packets pass through $g$, each of
these packets passes through at most $2I^3 + 2I^2 - I \le I^4$ other edges $g'$, and there are at
most $(2I^3 + 2I^2)(\log^2 I) \le I^4$ time frames $\tau'$, the dependence $b$ is at most $rI^{12}$. For $r \le I$,
we have $b \le I^{13}$. Since the packets use at most $q$ different edges in the network, and for
each edge there are at most $I^4$ bad events, the total number of nodes in $G$ is at most $qI^4$.
We say that a node in $G$ is critical if the corresponding event is critical. We say that a node
is endangered if its event shares a packet with an event that is critical. After each packet
has been either assigned a delay or set aside, let $G_1$ denote the subgraph of $G$ consisting of
the critical and endangered nodes and the edges between them. If a node is not in $G_1$, then
all of the packets that use the corresponding edge have already been assigned a delay, and
the bad event represented by that node cannot occur, no matter how we assign delays to the
packets not in $P$. Hence, from here on we need only consider the nodes in $G_1$.

Since different components are not connected by edges in $G_1$, no two components share
a packet. Also, any two events that involve edges traversed by the same packet share an
edge in $G_1$, and so are in the same connected component. Thus there exists a one-to-one
correspondence between components of $G_1$ and disjoint sets in a partition of the packets
not in $P$. Hence, we can assign the delays to the packets in each component separately.
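
One way to compute this partition is a union-find pass over the events, merging all set-aside packets that share an event. The input format (a mapping from each critical or endangered event to its packets without a delay) and the function name are assumptions of this sketch.

```python
def packet_components(event_packets):
    """event_packets: dict event -> set of set-aside packets. Returns the packet groups."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for packets in event_packets.values():
        members = sorted(packets)
        for q in members:
            # packets sharing an event end up in the same group
            parent[find(q)] = find(members[0])
    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(groups.values(), key=min)

groups = packet_components({"A": {1, 2}, "B": {2, 3}, "C": {4}})
```

Each returned group can then be handed to a separate, independent delay-assignment subproblem.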

In the following claim, we show that, with high probability, the size of the largest
connected component $U$ of $G_1$ is at most $I^{52}\log \mathcal{P}$. Hence we have
reduced the maximum possible size of a component from $qI^4$ in $G$ to $I^{52}\log \mathcal{P}$ in $G_1$.
Recall that the constant $k$ is used to determine whether an event becomes critical.

Claim 1 For any constant $\beta' > 0$, there exists a constant $k > 0$ such that the size of the
largest connected component of $G_1$ is at most $I^{52}\log \mathcal{P}$ with probability at least $1 - 1/\mathcal{P}^{\beta'}$.

Proof: The trick to bounding the size of a largest connected component $U$ is to observe
that the subgraph of critical nodes in $U$ is connected in the cube, $G_1^3$, of the graph $G_1$, i.e.,
the graph in which there is an edge between two distinct nodes $u$ and $v$ if in $G_1$ there is a
path of length at most 3 between $u$ and $v$. In $G_1^3$, the critical nodes of $U$ form a connected
subgraph because any path $u, e_1, e_2, e_3, v$ in $G_1$ that connects two critical or endangered
nodes $u$ and $v$ by passing through three consecutive endangered nodes $e_1, e_2, e_3$ can be
replaced by two paths $u, e_1, e_2, w$ and $w, e_2, e_3, v$ of length 3 that each pass through $e_2$'s
critical neighbor $w$. Let $G_2$ denote the subgraph of $G_1^3$ consisting only of the critical nodes
and the edges between them. Note that the degree of $G_2$ is at most $b^3$, and if two critical
nodes lie in the same connected component in $G_2$, then they also lie in the same connected
component in $G_1^3$, and hence in $G_1$.

By a similar argument, any maximal independent set of nodes in a connected
component of $G_2$ is connected in $G_2^3$. Note that if a set of nodes is independent in $G_2$, then it
is also independent in $G_1^3$ and in $G_1$. The nodes in an independent set in $G_1$ do not share
any packets, therefore the probabilities that each of these nodes becomes critical are
independent. Let $G_3$ be the subgraph of $G_2^3$ induced by the nodes in a maximal independent
set in $G_2$ (any such maximal independent set in $G_2$ will do). The nodes in $G_3$ form an
independent set of critical nodes in $G_1$. The degree of $G_3$ is at most $b^9$.
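
The chain of graphs $G_1 \to G_2 \to G_3$ can be traced on a toy example. The sketch below assumes graphs stored as dictionaries from node to neighbor set, uses a simple greedy rule for the maximal independent set, and takes a seven-node path with critical nodes 0, 3, and 6 as a made-up instance of $G_1$.

```python
def graph_power3(adj):
    """Cube of a graph: join distinct nodes whose distance in `adj` is at most 3."""
    cube = {}
    for u in adj:
        reach, frontier = {u}, {u}
        for _ in range(3):
            frontier = {w for v in frontier for w in adj[v]} - reach
            reach |= frontier
        cube[u] = reach - {u}
    return cube

def induced(adj, keep):
    """Subgraph induced by the node set `keep`."""
    return {u: adj[u] & keep for u in keep}

def maximal_independent_set(adj):
    """Greedy maximal independent set (any maximal independent set would do)."""
    chosen = set()
    for u in sorted(adj):
        if not (adj[u] & chosen):
            chosen.add(u)
    return chosen

# toy G_1: the path 0-1-2-3-4-5-6, where nodes 1, 2, 4, 5 are endangered
G1 = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
critical_nodes = {0, 3, 6}
G2 = induced(graph_power3(G1), critical_nodes)   # critical nodes, linked if within distance 3
mis = maximal_independent_set(G2)
G3 = induced(graph_power3(G2), mis)              # the independent set, linked through G_2^3
```

In this example the independent set is $\{0, 6\}$: its two nodes are not adjacent in $G_1$ or $G_2$, yet they form a connected graph $G_3$, which is the property the counting argument relies on.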

Our goal now is to show that, for any constant $\beta' > 0$, there exists a constant $k > 0$
such that the number of nodes in any connected component $W$ of $G_3$ is at most $\log \mathcal{P}$,
with probability $1 - 1/\mathcal{P}^{\beta'}$. To begin, with every connected component $W$ of $G_3$, we
associate a spanning tree of $W$ (any such tree will do). Note that, if $W$ and $W'$ are two
distinct connected components of $G_3$, then the spanning trees associated with $W$ and $W'$
are disjoint.

Now let us enumerate the different trees of size $\ell$ in $G_3$. To begin, a node is chosen
as the root. Since there are at most $qI^4$ nodes in $G_3$, there are at most $qI^4$ possible roots.
Next, we construct the tree as we perform a depth-first traversal of it. Nodes of the tree are
visited one at a time. At each node $u$ in the tree, either a previously unvisited neighbor of
$u$ is chosen as the next node to be visited, or the parent of $u$ is chosen to be visited (at the
root, the only option is to visit a previously unvisited neighbor). Thus, at each node there
are at most $b^9$ ways to choose the next node. Since each edge in the tree is traversed once
in each direction, and there are $\ell - 1$ edges, the total number of different trees with any one
root is at most $(b^9)^{2(\ell-1)} = b^{18(\ell-1)}$.

Any  tree  of  size  t  in  G3  corresponds  to  an  independent  set  of  size  f  in  Gi ;  moreover,  it 
corresponds  to  an  independent  set  of  C  critical  nodes  in  G'l .  We  Ccui  bound  the  probability 
that  all  of  the  nodes  in  any  particular  independent  subset  U  of  size  f  in  Gi  are  critical  as 
follows.  Let  Pc  be  the  probability  that  more  than  M  =  GT/I  -f  kr{I  A  T)T j {I y/\og  I) 
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packets  use  edge  g  inr  after  C  candidate  packets  to  use  ginr  have  been  assigned  delays. 
Then 


For a fixed deviation, $kr(I + T)T/(I\sqrt{\log I})$, that does not depend on $C$, the probability
$p_C$ of exceeding this deviation is maximized when $C$ is maximized, i.e., when $C = r(I + T)$,
implying $M = r(I + T)T(1 + k/\sqrt{\log I})/I$. Thus, $p_C \le p_{r(I+T)}$. Using
the Chernoff-type bound as in the proof of Proposition 3.6, with $\mu = r(I + T)T/I$,
$\gamma = k/\sqrt{\log I}$, and $k_0 = k^2/3$, we have

$$p_C \le p_{r(I+T)} = \Pr[S > (1 + \gamma)\mu] \le e^{-\gamma^2\mu/3} = e^{-k^2 r(I+T)T/(3I\log I)} \le e^{-k_0 \log I} \le e^{-k_0 \ln I} = 1/I^{k_0},$$

since $T \ge \log^2 I$ and $r \ge 1$. Thus
the probability that the event for $g$ and $\tau$ becomes critical after $C$ candidate packets to use
$g$ in $\tau$ have been assigned delays is at most $1/I^{k_0}$. Since the nodes in $U$ are independent
in $G_1$, the corresponding events are also independent. Hence the probability that all of the
nodes in the independent set are critical after all packets are assigned delays or put aside
is at most $I^{-k_0\ell}$. Thus the probability that there exists a tree of size $\ell$ in $G_3$ is at most
$qI^4 b^{18\ell} I^{-k_0\ell} \le qI^{4 - (k_0 - 234)\ell}$ (since there exist at most $qI^4 b^{18\ell}$ different trees of size $\ell$ in
$G_3$ and $b \le I^{13}$). Since $q \le \mathcal{P}$, we can make this probability less than $1/\mathcal{P}^{\beta'}$, for $\ell = \log \mathcal{P}$
and any fixed constant $\beta' > 0$, by choosing $k$ to be a sufficiently large constant. Hence,
with probability at least $1 - 1/\mathcal{P}^{\beta'}$, the size of the largest spanning tree in $G_3$ will be at most $\log \mathcal{P}$.
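The Chernoff step can be checked numerically with a short sketch. It assumes the standard multiplicative form $e^{-\gamma^2\mu/3}$ of the bound (valid for $0 < \gamma \le 1$) and base-2 logarithms; both are assumptions, since the exact statement used in Proposition 3.6 is not reproduced in this section. The function names are hypothetical.

```python
import math


def chernoff_upper_tail(mu, gamma):
    """Assumed bound exp(-gamma^2 * mu / 3) on Pr[S > (1 + gamma) * mu], 0 < gamma <= 1."""
    return math.exp(-gamma * gamma * mu / 3.0)


def critical_event_bound(r, I, T, k):
    """Bound on the probability that the event for one edge and one T-frame is critical.

    Uses mu = r (I + T) T / I and gamma = k / sqrt(log I), as in the text.
    """
    mu = r * (I + T) * T / I
    gamma = k / math.sqrt(math.log2(I))
    return chernoff_upper_tail(mu, gamma)


# Illustrative values: I = 2^16, so log I = 16, T = log^2 I = 256, gamma = 0.5.
I, T, r, k = 2 ** 16, 256, 1, 2
k0 = k * k / 3
assert critical_event_bound(r, I, T, k) <= I ** (-k0)
```

The assertion reflects the chain of inequalities in the text: once $T \ge \log^2 I$ and $r \ge 1$, the exponent $\gamma^2\mu/3$ is at least $k_0 \log I$.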

We can now bound the size of the largest connected component in $G_1$. Since (i) the
largest connected component in $G_3$ has at most $\ell$ nodes, with probability at least $1 - 1/\mathcal{P}^{\beta'}$;
(ii) each of these $\ell$ nodes may have $b^3$ neighbors in $G_2$; and (iii) each node in $G_2$ is either
in $G_3$ or is a neighbor of a node in $G_3$, the largest connected component in $G_2$ contains at
most $b^3\ell$ nodes, with probability at least $1 - 1/\mathcal{P}^{\beta'}$. As we argued before, the critical nodes
in any connected component of $G_1$ are connected in $G_2$. Thus, the maximum number of
critical nodes in any connected component of $G_1$ is at most $b^3\ell$. Since each of these nodes
may have as many as $b$ endangered neighbors (and each endangered neighbor is adjacent
to a critical node), and since $\ell = \log \mathcal{P}$, the size of the largest connected component in $G_1$
is at most $b^4\ell \le I^{52} \log \mathcal{P}$, with high probability. Q.E.D.

Since $I = \log \mathcal{P}$ in the scope of this lemma, the size of the largest connected component
in $G_1$ is at most $(\log \mathcal{P})^{53}$, for $k$ large enough, with probability at least $1 - 1/\mathcal{P}^{\beta'}$, for any
constant $\beta' > 0$. We still have to find a schedule for the packets not in $P$. We now have a
collection of independent subproblems to solve, one for each component in the dependence
graph. We can use Proposition 3.6 to find the initial delays for these packets. Since each
node in the dependence graph corresponds to an edge in the routing network, a component
with $x$ nodes in the dependence graph corresponds to at most $x$, and possibly fewer, edges
in the routing network.

We apply Proposition 3.6 to each of the independent subproblems. In the original block,
let $r_g$ be the relative congestion of edge $g$ with respect to the packets not in $P$ in frames of
size $I$ or larger, and let $r_g^H$ be the relative congestion of edge $g$ with respect to the packets in
subproblem $H$ in frames of size $I$ or larger ($r_g = \sum_H r_g^H$). Let $q_H$ be the number of distinct
edges associated with subproblem $H$, for all $H$. Note that $q_H \le I^{52} \log \mathcal{P} = I^{53}$,
since $I = \log \mathcal{P}$. After applying Proposition 3.6 to a subproblem $H$, the relative congestion
of any edge $g$ with respect to the packets in $H$ in frames of size $\log^2 I$ or larger, in between
the first and last $I^2$ steps in the final schedule, is at most $r_g^H(1 + k_1/\sqrt{\log I})$, for some
constant $k_1 > 0$, with probability at least $1 - q_H/I^{\epsilon} \ge 1 - 1/(\log \mathcal{P})^{\epsilon - 53}$, for any constant
$\epsilon > 0$.

Since the routing subproblems are mutually independent and disjoint, if we apply
Proposition 3.6 $\log \mathcal{P}/(\log \log \mathcal{P})$ times to each of the at most $N \le \mathcal{P}$ subproblems (note
that each packet appears in at most one subproblem), then for any constant $\epsilon > 53$, and $\mathcal{P}$
large enough, with probability at least $1 - N/(\log \mathcal{P})^{(\epsilon - 53)\log \mathcal{P}/(\log \log \mathcal{P})} \ge 1 - 1/\mathcal{P}^{\epsilon - 54}$,
the relative congestion of edge $g$ with respect to the packets not in $P$, in any frame of size
$\log^2 I$ or greater, is at most $\sum_H r_g^H(1 + k_1/\sqrt{\log I}) = r_g(1 + k_1/\sqrt{\log I})$.

Applying Proposition 3.6 $\log \mathcal{P}/(\log \log \mathcal{P})$ times for each subproblem takes time
$O(\sum_H q_H (I^3 + I^2)(\log^2 I) \log \mathcal{P}/\log \log \mathcal{P}) = O(Q(\log^4 \mathcal{P})(\log \log \mathcal{P}))$, since $I = \log \mathcal{P}$
and $Q \ge \sum_H q_H$.
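The effect of repeating Proposition 3.6 on each subproblem can be sketched as follows. The sketch assumes that the tries on one subproblem fail independently with the same per-try failure probability, and it combines the subproblems with a union bound; the function and parameter names are hypothetical, and logarithms are taken base 2.

```python
import math


def retries_used(total_path_length):
    """Number of repetitions log P / log log P, rounded up (logs base 2)."""
    log_p = math.log2(total_path_length)
    return math.ceil(log_p / math.log2(log_p))


def overall_failure_bound(per_try_failure, tries, num_subproblems):
    """Union bound over subproblems; a subproblem fails only if all of its tries fail."""
    return min(1.0, num_subproblems * per_try_failure ** tries)


# Illustrative numbers: P = 2^64, so log P = 64 and log log P = 6.
P = 2 ** 64
tries = retries_used(P)                       # ceil(64 / 6)
bound = overall_failure_bound(64.0 ** -2, tries, P)
assert tries == 11
assert bound <= 1.0 / P
```

A per-try failure probability that is only polynomially small in $\log \mathcal{P}$ thus becomes polynomially small in $\mathcal{P}$ after $\log \mathcal{P}/\log \log \mathcal{P}$ tries, which is why the repetition count has this form.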

We now have schedules for the packets in $P$ and for the packets not in $P$. Fix any edge $g$
traversed by some packet in the block and a $T$-frame $\tau$, where $T \in [\log^2 I, 2\log^2 I - 1]$. The
total number of candidate packets in $P$ to use edge $g$ in $\tau$ after the delays have been assigned
is given by $C_g$. The number of packets in $P$ that traverse edge $g$ in $\tau$ in the resulting
schedule is at most $C_g T/I + kr(I + T)T/(I\sqrt{\log I}) \le r(I + T)T(1 + k/\sqrt{\log I})/I \le
rT(1 + (2k + 1)/\sqrt{\log I})$, since $(I + T)/I \le 1 + 1/(2\sqrt{\log I})$, for $I$ large enough, and
$C_g \le r(I + T)$. Hence the relative congestion of edge $g$ in $\tau$ with respect to the packets in
$P$ is at most $r(1 + (2k + 1)/\sqrt{\log I})$.

Now we consider the relative congestion of the packets not in $P$. With probability at
least $1 - 1/\mathcal{P}^{\beta'} - 1/\mathcal{P}^{\epsilon - 54}$, for any positive constants $\beta'$ and $\epsilon$, there exists a constant
$k_1$ such that the number of packets not in $P$ that traverse $g$ in the new schedule is at most
$r_g T(1 + k_1/\sqrt{\log I}) \le rT(1 + k_1/\sqrt{\log I})$.

Combining the relative congestions for packets in $P$ and not in $P$, we get that the
relative congestion of edge $g$ in $\tau$ is at most $2r(1 + \max(2k + 1, k_1)/\sqrt{\log I})$. Choose
$\sigma = \max(2k + 1, k_1)/\sqrt{\log I}$. Choose $\beta'$ and $\epsilon$ such that $1/\mathcal{P}^{\beta} \ge 1/\mathcal{P}^{\beta'} + 1/\mathcal{P}^{\epsilon - 54}$. Hence, for
any constant $\beta > 0$, there exist constants $k$ and $k_1 > 0$ such that the relative congestion
of edge $g$ in $\tau$ is at most $2r(1 + \sigma)$, for any edge $g$, for any $T$-frame $\tau$, for any
$T \in [\log^2 I, 2\log^2 I - 1]$, with probability at least $1 - 1/\mathcal{P}^{\beta}$.

The number of time steps taken by the algorithm just described is
$O(Q(\log \log \mathcal{P} + \log \mathcal{P})(\log^3 \mathcal{P})(\log \log \mathcal{P})) = O(Q(\log^4 \mathcal{P})(\log \log \mathcal{P}))$. Q.E.D.

Proposition 3.7.2 Let the relative congestion in any frame of size $I$ or greater be at most $r$
in a block of size $2I^3 + 2I^2 - I$, where $1 \le r \le I$ and $I = (\log \log \mathcal{P})^2$. Let $Q$ be the sum
of the lengths of the paths traversed by the packets in this block. Then, for any constant
$\beta > 0$,

1. there is an algorithm for assigning initial delays in the range $[0, I]$ to the packets
so that in between the first and last $I^2$ steps of the block, the relative congestion in
any frame of size $\log^2 I$ or greater is at most $r'$, where $r' = 2r(1 + \sigma)$ and
$\sigma = O(1)/\sqrt{\log I}$;

2. this algorithm runs in $Q(\log \mathcal{P})(\log \log \mathcal{P})^6(\log \log \log \mathcal{P})^{O(1)}$ time steps, with
probability at least $1 - 1/\mathcal{P}^{\beta}$.

Proof: The first part of the proof of this proposition is identical to the part where we assign
delays to the packets in $P$ in the proof of Proposition 3.7.1 (we let $I = (\log \log \mathcal{P})^2$ in
that proof).

However, since $I = (\log \log \mathcal{P})^2$, we need to make an additional pass assigning delays
to the packets in this proof, in order to reduce the component size in the dependence
graph to a polynomial in $I$. From there, we proceed by applying Proposition 3.6
to each component separately, as we did in the proof of Proposition 3.7.1. In the first
pass, we reduce the maximum component size in $G_1$ to $I^{52} \log \mathcal{P}$ (by taking
$\ell = \log \mathcal{P}$, as in Proposition 3.7.1), with probability at least $1 - 1/\mathcal{P}^{\beta'}$, for any constant
$\beta' > 0$. In the second pass, we reduce the component size from $I^{52} \log \mathcal{P}$ down
to $I^{52} \log \log \mathcal{P} \le I^{53}$, by taking $\ell = \log \log \mathcal{P}$, and noting that the number of edges in
any connected component is at most $I^{52} \log \mathcal{P}$. For any component, this step will succeed
with probability at least $1 - 1/(I^{52} \log \mathcal{P})^{\beta'}$, for any constant $\beta' > 0$. To make
this probability as high as it was in the case $I = \log \mathcal{P}$, if a pass fails for any component,
we simply try to reduce the component size again, up to $\log \mathcal{P}/(\log \log \mathcal{P})$ times.
Then with probability at least $1 - 1/\mathcal{P}^{\beta'}$, for any constant $\beta' > 0$, we have reduced the
component size to at most $I^{53}$. Since (i) for each packet assigned a delay in these two
passes, we have to check whether the event for an edge $g$ and a $T$-frame $\tau$ becomes critical,
for all edges $g$ traversed by the packet, for all $T$-frames $\tau$, for $T \in [\log^2 I, 2\log^2 I - 1]$
(using a similar procedure to that in the last paragraph of Proposition 3.6), and since
(ii) we repeat the second pass $O(\log \mathcal{P}/\log \log \mathcal{P})$ times, the two passes take
$O(Q(I^3 + I^2)(\log^2 I)(\log \mathcal{P})/(\log \log \mathcal{P})) \le O(Q(\log \log \mathcal{P})^6 \log^2((\log \log \mathcal{P})^2) \log \mathcal{P}/(\log \log \mathcal{P})) \le
Q(\log \mathcal{P})(\log \log \mathcal{P})^5(\log \log \log \mathcal{P})^{O(1)}$ time steps.

The second pass adds some packets to the set $P$. Let $P_1$ and $P_2$ denote the number of
packets assigned delays in the first and second pass, respectively. Then the relative congestion
due to these packets will be at most $[(P_1 + P_2)T/I + 2kr(I + T)T/(I\sqrt{\log I})]/T \le
r(I + T)/I + 2kr(I + T)/(I\sqrt{\log I}) \le r[1 + T/I + 2k(I + T)/(I\sqrt{\log I})] \le r(1 + (4k + 1)/\sqrt{\log I})$,
since $T \le 2\log^2 I$ and $2\log^2 I/I \le 1/\sqrt{\log I}$.

If  the  two  passes  fail  to  achieve  the  desired  relative  congestion,  we  try  again. 

Now we apply Proposition 3.6 up to $\log \mathcal{P}/(\log \log \log \mathcal{P})$ times, assigning delays to the
packets not in $P$, verifying at the end of each application whether the schedule obtained has
relative congestion $r(1 + k_1/\sqrt{\log I})$, for some constant $k_1$ to be specified later. Here we
need to apply Proposition 3.6 up to $\log \mathcal{P}/(\log \log \log \mathcal{P})$ times to each resulting component
(rather than $\log \mathcal{P}/(\log \log \mathcal{P})$ as in the proof of Proposition 3.7.1) since the component
size now is $O(I^{53}) = (\log \log \mathcal{P})^{O(1)}$, and so our bound on the failure probability for each
component is only $1/(\log \log \mathcal{P})^{O(1)}$ (since the bound given by Proposition 3.6 is at best
polynomially small in $I$ and $I = (\log \log \mathcal{P})^2$), rather than $1/(\log \mathcal{P})^{O(1)}$. The assignment
of delays to the packets not in $P$ takes at most $Q(\log \log \mathcal{P})^6(\log \log \log \mathcal{P})^{O(1)}(\log \mathcal{P})$ time
steps, since $Q$ is an upper bound on the sum of the number of edges in each component.
For any constant $\beta' > 0$, there exists a constant $k_1 > 0$ such that we obtain a feasible
schedule for these packets with relative congestion $r(1 + k_1/\sqrt{\log I})$ with probability at
least $1 - 1/\mathcal{P}^{\beta'}$.

We have schedules for the packets in $P$ and for the packets not in $P$, with relative
congestions $r(1 + (4k + 1)/\sqrt{\log I})$ and $r(1 + k_1/\sqrt{\log I})$, respectively, with probability
at least $1 - 2/\mathcal{P}^{\beta'}$, for any constant $\beta' > 0$. The two schedules can be found in at most
$Q(\log \mathcal{P})(\log \log \mathcal{P})^6(\log \log \log \mathcal{P})^{O(1)}$ time steps. When we merge the two schedules, the
resulting relative congestion may be as large as the sum of the two relative congestions;
that is, the resulting relative congestion may be as large as $2r(1 + \max(4k + 1, k_1)/\sqrt{\log I})$,
with probability at least $1 - 1/\mathcal{P}^{\beta}$, for large enough constants $k$ and $k_1 > 0$, for any fixed
$\beta > 0$ (choose $\beta'$ such that $1/\mathcal{P}^{\beta} \ge 2/\mathcal{P}^{\beta'}$). Let $\sigma = \max(4k + 1, k_1)/\sqrt{\log I}$.

Q.E.D. 


3.3.3  Applying  exhaustive  search 

The remaining $O(\log^*(c + d))$ applications of Lemma 3.7 in [27] are replaced by applications
of the following proposition, which uses the same technique as Propositions 3.7.1
and 3.7.2, except that instead of using Proposition 3.6 for each component of the subgraph
induced by critical and endangered nodes in the dependence graph, it uses the Lovász Local
Lemma and exhaustive search to find the settings of the delays for the packets. Proposition
3.7.3 does not allow a constant factor increase in the relative congestion of the refined
schedule, which prevents a blowup in the final relative congestion.

Proposition 3.7.3 Let the relative congestion in any frame of size $I$ or greater be at most $r$
in a block of size $2I^3 + 2I^2 - I$, where $1 \le r \le I$ and $I \le (\log \log \log \mathcal{P})^{O(1)}$. Let $Q$ be
the sum of the lengths of paths taken by the packets in this block. Then, for any constant
$\beta > 0$,

1. there is an algorithm for assigning initial delays in the range $[0, I]$ to the packets so
that the relative congestion in any frame of size $\log^2 I$ or greater in between the first
and last $I^2$ steps in the resulting schedule is at most $r'$, where $r' = r(1 + \sigma)$ and
$\sigma = O(1)/\sqrt{\log I}$;

2. this algorithm runs in $Q(\log \mathcal{P})(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}$ time steps,
with probability at least $1 - 1/\mathcal{P}^{\beta}$.

Proof: The proof uses the Lovász Local Lemma to show that an assignment of initial delays
satisfying the conditions of the proposition exists.

We first assign delays to some packets by making three passes through the packets
using the algorithm of Proposition 3.7.1 (for making the initial pass assigning delays to the
packets in $P$) in each pass. Let $C_g^i$, $1 \le i \le 3$, be the number of candidate packets to use
edge $g$ in $\tau$ that were assigned delays in the $i$th pass. After the first pass, we have that (i)
the number of packets assigned delays in this pass that use edge $g$ in the new schedule is at
most $C_g^1 T/I + kr(I + T)T/(I\sqrt{\log I})$, and (ii) with probability at least $1 - 1/\mathcal{P}^{\beta'}$, for any
constant $\beta' > 0$, the size of the largest component in the dependency graph is at most $I^{52} \log \mathcal{P}$.

We need to make two more passes assigning delays to the packets, reducing the size
of the largest connected component first to $I^{52}(\log \log \mathcal{P})$, and then to
$I^{52}(\log \log \log \mathcal{P}) = (\log \log \log \mathcal{P})^{O(1)}$ (since $I \le (\log \log \log \mathcal{P})^{O(1)}$), by taking
$\ell = \log \log \mathcal{P}$ in the second pass and $\ell = \log \log \log \mathcal{P}$ in the third pass. If we fail to reduce
the component size as desired, the second pass is repeated up to $\log \mathcal{P}/(\log \log \mathcal{P})$ times and
the third pass is repeated up to $\log \mathcal{P}/(\log \log \log \mathcal{P})$ times. The number of packets assigned
delays in the second (resp., third) pass that traverse edge $g$ in the new schedule is at most
$C_g^2 T/I + kr(I + T)T/(I\sqrt{\log I})$ (resp., $C_g^3 T/I + kr(I + T)T/(I\sqrt{\log I})$). As before, $k$ is chosen
large enough so that the failure probability in each pass is at most $1/\mathcal{P}^{\beta'}$, for any constant
$\beta' > 0$.

In each pass, we assign a random delay to each packet and check whether the event for
any edge $g$ traversed by this packet and any $T$-frame $\tau$, where $T \in [\log^2 I, 2\log^2 I - 1]$,
becomes critical, as we did in Propositions 3.7.1-2. Thus each pass takes time
$O(Q(I^3 + I^2)(\log^2 I)) = Q(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}$. For any constant $\beta > 0$, choose
$\beta'$ such that $1/\mathcal{P}^{\beta} \ge 3/\mathcal{P}^{\beta'}$. Hence, since we repeat the second and third passes up to
$\log \mathcal{P}/(\log \log \mathcal{P})$ and $\log \mathcal{P}/(\log \log \log \mathcal{P})$ times, respectively, we succeed in reducing the
component size to $(\log \log \log \mathcal{P})^{O(1)}$ in $Q(\log \mathcal{P})(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}$ time
steps, with probability at least $1 - 1/\mathcal{P}^{\beta}$. Let $P$ be the set of packets assigned delays
in the three passes.

We now use the Lovász Local Lemma to show that there exists a way of completing the
assignment of delays (i.e., to assign delays to the packets not in $P$) so that the relative
congestion in frames of size $\log^2 I$ or greater in this block is at most $r(1 + O(1)/\sqrt{\log I})$.
We associate a bad event with each edge and each time frame of size $\log^2 I$ through
$2\log^2 I - 1$. The bad event for an edge $g$ and a particular $T$-frame $\tau$ occurs when more
than $M_g = (r(I + T) - P_g)T/I + kr(I + T)T/(I\sqrt{\log I})$ packets not in $P$ use edge $g$ in
$\tau$, where $P_g$ is the number of packets in $P$ that traverse edge $g$ during $\tau$ after the delays
have been assigned to the packets in $P$ (note that there are at most $r(I + T) - P_g$ candidate
packets not in $P$ to use edge $g$ in $\tau$). As we argued in the proof of Proposition 3.7.1, the
total number of bad events involving any one edge is at most $I^4$. We show that if each packet
not in $P$ is assigned a delay chosen randomly, independently, and uniformly from the range
$[0, I]$, then with nonzero probability no bad event occurs. In order to apply the lemma, we
must bound both the dependence of the bad events, and the probability that any bad event
occurs. The dependence $b$ is at most $I^{13}$, as argued before. For any edge $g$ and $T$-frame
$\tau$ that contains $g$, where $\log^2 I \le T \le 2\log^2 I - 1$, the probability $p_g$ that more than
$M_g$ packets not in $P$ use $g$ in $\tau$ can be shown to be at most $1/I^{14}$, for sufficiently large $k$,
using exactly the same Chernoff-bound argument that was used in Proposition 3.7.1. Thus,
$4 \max_{g \in G}\{p_g\}\, b \le 4/I < 1$ (for $I > 4$). Hence, since $\max_{g \in G}\{p_g\}$ is an upper bound on
the probability of any bad event occurring, by the Lovász Local Lemma, there is some way
of assigning delays to the packets not in $P$ so that no bad event occurs.
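The condition checked in the paragraph above can be written as a one-line predicate. This is only a sketch of the symmetric form of the Lovász Local Lemma as it is used here (the product $4pb$ must be below 1); the function name and the numeric values are hypothetical and chosen for illustration.

```python
def lll_condition_holds(max_bad_event_prob, dependence):
    """Check 4 * p * b < 1, where p bounds every bad-event probability and
    b bounds the number of other bad events that any one event depends on."""
    return 4.0 * max_bad_event_prob * dependence < 1.0


# Small p relative to 1/b: some assignment avoiding all bad events exists.
assert lll_condition_holds(1e-6, 1000)
# Here 4 * p * b = 4, so the lemma gives no guarantee.
assert not lll_condition_holds(0.01, 100)
```

The lemma only asserts existence; it does not say how to find the assignment, which is why the text turns to exhaustive search on the small subproblems.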

Since at most $r(I + T)$ packets pass through the edge associated with any critical node,
and there are at most $(I + 1)$ choices for the delay assigned to each packet, the number
of different possible assignments for any subproblem containing $(\log \log \log \mathcal{P})^{O(1)}$ critical
nodes is at most $(I + 1)^{r(I + T)(\log \log \log \mathcal{P})^{O(1)}} \le I^{4I^2(\log \log \log \mathcal{P})^{O(1)}}$ (since $r \le I$ and
$T \le 2\log^2 I$). For $I \le (\log \log \log \mathcal{P})^{O(1)}$, and $\mathcal{P}$ larger than some constant, this quantity is
smaller than $(\log \mathcal{P})^{\gamma}$, for any fixed constant $\gamma > 0$. Hence, we need to try out at most
$\log^{\gamma} \mathcal{P}$ possible delay assignments.
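The exhaustive search over delay assignments can be sketched on a toy model. The sketch below makes several simplifying assumptions that are not in the original text: a packet with delay $d$ crosses the $j$th edge of its path at time step $d + j$, a single frame length is checked instead of every $T$ in the allowed range, and one common load limit is used for all edges. All names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def max_frame_load(paths, delays, frame_len):
    """Largest number of traversals of one edge inside any window of frame_len steps."""
    times = defaultdict(list)                  # edge -> time steps at which it is used
    for path, delay in zip(paths, delays):
        for step, edge in enumerate(path):
            times[edge].append(delay + step)
    worst = 0
    for used in times.values():
        used.sort()
        lo = 0
        for hi, t in enumerate(used):          # sliding window over the sorted times
            while used[lo] <= t - frame_len:
                lo += 1
            worst = max(worst, hi - lo + 1)
    return worst


def exhaustive_delays(paths, max_delay, frame_len, limit):
    """Try every delay vector in [0, max_delay]^n; return the first one whose
    load stays within `limit`, or None if no such vector exists."""
    for delays in product(range(max_delay + 1), repeat=len(paths)):
        if max_frame_load(paths, delays, frame_len) <= limit:
            return delays
    return None


# Three packets that all need edge "e"; with delays 0..2 they can be spread out.
assert exhaustive_delays([["e"], ["e"], ["e"]], 2, 1, 1) == (0, 1, 2)
# With only two possible delays, two of the packets must collide.
assert exhaustive_delays([["e"], ["e"], ["e"]], 1, 1, 1) is None
```

The search space has $(\text{max\_delay} + 1)^n$ points, which mirrors the $(I + 1)^{(\cdot)}$ count in the text and explains why the subproblems must first be made very small.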


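After assigning delays to all of the packets, the number of packets that use an edge $g$ in
any $T$-frame $\tau$ is at most

$$\sum_{i=1}^{3}\left(\frac{C_g^i T}{I} + \frac{kr(I + T)T}{I\sqrt{\log I}}\right) + \frac{(r(I + T) - P_g)T}{I} + \frac{kr(I + T)T}{I\sqrt{\log I}} \le \frac{r(I + T)T}{I} + \frac{4kr(I + T)T}{I\sqrt{\log I}}$$

with probability at least $1 - 1/\mathcal{P}^{\beta}$, since each packet is assigned a delay exactly once, and
thus $r(I + T) - P_g + \sum_{i=1}^{3} C_g^i \le r(I + T)$. Thus the relative congestion in any
$T$-frame, for $\log^2 I \le T < 2\log^2 I$, is at most

$$r\left(1 + \frac{T}{I}\right)\left(1 + \frac{4k}{\sqrt{\log I}}\right) \le r\left(1 + \frac{8k + 1}{\sqrt{\log I}}\right) = r(1 + \sigma),$$

by taking $\sigma = (8k + 1)/\sqrt{\log I}$, since $2\log^2 I/I \le 1/\sqrt{\log I}$, for $I$ large enough.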

We can bound the total number of time steps taken by the algorithm as follows. The first
three passes take time $Q(\log \mathcal{P})(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}$, with probability
at least $1 - 1/\mathcal{P}^{\beta}$. After the third pass, we solve subproblems containing $(\log \log \log \mathcal{P})^{O(1)}$
critical nodes exhaustively. For each subproblem, for each of the at most $\log^{\gamma} \mathcal{P}$ possible
assignments of delays to the packets in the subproblem, for each of the at most $(I^3 + I^2)\log^2 I$
$T$-frames $\tau$ in the subproblem, $\log^2 I \le T < 2\log^2 I$, and for every edge $g$ in $\tau$, we
check whether more than $M_g$ packets traverse $g$ during $\tau$ (using the procedure described
in the proof of Proposition 3.6). This takes time $O(Q(I^3 + I^2)(\log^2 I)(\log^{\gamma} \mathcal{P}))$, which is
at most $Q(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}(\log^{\gamma} \mathcal{P})$, for $\mathcal{P}$ large enough, for any
fixed $\gamma > 0$ (since the sum of the number of distinct edges in each subproblem is at
most $Q$, and since $I = (\log \log \log \mathcal{P})^{O(1)}$). In particular, for $\gamma = 1$, this quantity is
bounded by $Q(\log \mathcal{P})(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}$. Hence the algorithm runs
in $Q(\log \mathcal{P})(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}$ time steps, with probability at least
$1 - 1/\mathcal{P}^{\beta}$, for any constant $\beta > 0$.

Q.E.D. 


3.3.4  Moving  the  block  boundaries 

Now we present the three replacement propositions for Lemma 3.9 of [27], which bounds
the relative congestion after we move the block boundaries (as in [27]). The three propositions
that follow are analogous to the three replacement propositions, Propositions 3.7.1-3,
for Lemma 3.7 of [27]. The necessary changes in the proof of Lemma 3.9 of [27], in places
where the Lovász Local Lemma is used, are analogous to the changes made in the proof of
Lemma 3.7 of [27], for the cases $I = \log \mathcal{P}$, $I = (\log \log \mathcal{P})^2$, and $I = (\log \log \log \mathcal{P})^{O(1)}$.
Therefore, we omit the proofs of Propositions 3.9.1-3.

Suppose we have a block of size $2I^3 + 3I^2$, obtained after the insertion of delays into
the schedule as described in Propositions 3.7.1, 3.7.2, or 3.7.3, according to the current
value of $I$. Then suppose we move the block boundaries as described in [27]. Each of
Propositions 3.9.1-3 also refers to a specific size of $I$. Note that in [27], the steps between steps
$I^3$ and $I^3 + 2I^2$ in the block are called the "fuzzy region" of the block. We assume that
the relative congestion in any frame of size $I$ or greater in the block is at most $r$, where
$1 \le r \le I$. Let $Q$ be the sum of the lengths of the paths taken by the packets in the block.

Proposition 3.9.1 For $I = \log \mathcal{P}$, for any constant $\beta > 0$,

1. there is an algorithm for assigning delays in the range $[0, I^2]$ to the packets such that
in between steps $I \log^3 I$ and $I^3$ and in between steps $I^3 + 3I^2$ and $2I^3 + 3I^2 - I \log^3 I$,
the relative congestion in any frame of size $\log^2 I$ or greater is at most $2r(1 + \sigma_1)$,
where $\sigma_1 = O(1)/\sqrt{\log I}$, and such that in between steps $I^3$ and $I^3 + 3I^2$, the
relative congestion in any frame of size $\log^2 I$ or greater is at most $2r(1 + \sigma_2)$, where
$\sigma_2 = O(1)/\sqrt{\log I}$;

2. this algorithm runs in $O(Q(\log \mathcal{P})^4(\log \log \mathcal{P}))$ time steps, with probability at least
$1 - 1/\mathcal{P}^{\beta}$.

Proposition 3.9.2 For $I = (\log \log \mathcal{P})^2$, for any constant $\beta > 0$,

1. there is an algorithm for assigning delays in the range $[0, I^2]$ to the packets such that
in between steps $I \log^3 I$ and $I^3$ and in between steps $I^3 + 3I^2$ and $2I^3 + 3I^2 - I \log^3 I$,
the relative congestion in any frame of size $\log^2 I$ or greater is at most $2r(1 + \sigma_1)$,
where $\sigma_1 = O(1)/\sqrt{\log I}$, and such that in between steps $I^3$ and $I^3 + 3I^2$, the
relative congestion in any frame of size $\log^2 I$ or greater is at most $2r(1 + \sigma_2)$, where
$\sigma_2 = O(1)/\sqrt{\log I}$;

2. this algorithm runs in $Q(\log \mathcal{P})(\log \log \mathcal{P})^6(\log \log \log \mathcal{P})^{O(1)}$ time steps, with
probability at least $1 - 1/\mathcal{P}^{\beta}$.

Proposition 3.9.3 For $I = (\log \log \log \mathcal{P})^{O(1)}$, for any constant $\beta > 0$,

1. there is an algorithm for assigning delays in the range $[0, I^2]$ to the packets such that
in between steps $I \log^3 I$ and $I^3$ and in between steps $I^3 + 3I^2$ and $2I^3 + 3I^2 - I \log^3 I$,
the relative congestion in any frame of size $\log^2 I$ or greater is at most $r(1 + \sigma_1)$,
where $\sigma_1 = O(1)/\sqrt{\log I}$, and such that in between steps $I^3$ and $I^3 + 3I^2$, the
relative congestion in any frame of size $\log^2 I$ or greater is at most $r(1 + \sigma_2)$, where
$\sigma_2 = O(1)/\sqrt{\log I}$;

2. this algorithm runs in $Q(\log \mathcal{P})(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}$ time steps,
with probability at least $1 - 1/\mathcal{P}^{\beta}$.

3.4  Running  time 

Theorem 1 For any constant $\beta > 0$, the algorithm for finding an $O(c + d)$-step schedule
of the packets takes $O(m(c + d)(\log \mathcal{P})^4(\log \log \mathcal{P}))$ time steps overall, with probability at
least $1 - 1/\mathcal{P}^{\beta}$.

Proof: For any constant $\beta' > 0$, we place an upper bound on the number of time steps taken
by the application of Proposition 3.2, followed by the applications of Propositions 3.7.1,
3.9.1, 3.7.2, and 3.9.2, then followed by the applications of Propositions 3.7.3 and 3.9.3.
The application of Proposition 3.2 takes $O(m(c + d) \log \mathcal{P})$ time steps, with probability at
least $1 - 1/\mathcal{P}^{\beta'}$. Each of the Propositions 3.7.1-3, and each of the Propositions 3.9.1-3, dealt
with a single block. For any $I$, partitioning the schedule into disjoint blocks and moving
the block boundaries as described in [27] take $O(\mathcal{P})$ time. Let $n_I$ be the number of blocks
in the schedule for any given $I$.

We place an upper bound on the number of time steps taken by the applications of
Propositions 3.7.1-3 and 3.9.1-3 as follows. Assume the blocks are numbered from 1
to $n_I$. Note that $\sum_{i=1}^{n_I} Q_i = \mathcal{P}$, where $Q_i$ is the sum of the lengths of the paths
traversed by the packets in block $i$. Thus the applications of Propositions 3.7.1 and 3.9.1 take
$O(\mathcal{P}(\log \mathcal{P})^4(\log \log \mathcal{P}))$ steps; and the applications of Propositions 3.7.2 and 3.9.2 take
$\mathcal{P}(\log \mathcal{P})(\log \log \mathcal{P})^6(\log \log \log \mathcal{P})^{O(1)}$ steps. For each partition of the schedule for a given
$I \le (\log \log \log \mathcal{P})^{O(1)}$, we apply Propositions 3.7.3 and 3.9.3 to every block $i$ in this
partition, $1 \le i \le n_I$, taking overall time $\mathcal{P}(\log \mathcal{P})(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)}$.
Since we will repartition the schedule $O(\log^*(c + d))$ times after we bring $I$ down to
$(\log \log \log \mathcal{P})^{O(1)}$, the overall running time due to applications of Propositions 3.7.3 and
3.9.3 is $\mathcal{P}(\log \mathcal{P})(\log \log \log \mathcal{P})^{O(1)}(\log \log \log \log \mathcal{P})^{O(1)} \log^*(c + d)$.

Choose $\beta' > 0$ such that $1/\mathcal{P}^{\beta} \ge O(\log^*(c + d))/\mathcal{P}^{\beta'}$. Thus the total number of time
steps taken by the algorithm is $O(m(c + d)(\log \mathcal{P})^4(\log \log \mathcal{P}))$, for $\mathcal{P}$ large enough, with
probability at least $1 - 1/\mathcal{P}^{\beta}$, for any constant $\beta > 0$. Note that we used the inequalities
$\mathcal{P} \ge c$, $\mathcal{P} \ge d$, and $\mathcal{P} \le m(c + d)$. Q.E.D.


3.5  A  parallel  scheduling  algorithm 


At first glance, it seems as though the algorithm that was described in Section 3.3 is
inherently sequential. This is because the decision concerning whether or not to assign a delay to
a packet is made sequentially. In particular, a packet is deferred (i.e., not assigned a delay)
if and only if the packet might be involved in an event (i.e., the packet traverses an edge
that corresponds to an event) that became critical because of the delays assigned to prior
packets.

In [1], Alon describes a parallel version of Beck's algorithm which proceeds by assigning
values to all random variables (in this case delays to all packets) in parallel, and then
unassigning values to those variables that are involved in bad events. The Alon approach
does not work in this application because we cannot afford the constant factor blow-up in
relative congestion that would result from this process.

Rather,  we  develop  an  alternative  method  for  parallelizing  the  algorithm.  The  key  idea 
is  to  process  the  packets  in  a  random  order.  At  each  step,  all  packets  that  do  not  share  an 
edge  with  an  as-yet-unprocessed  packet  of  higher  priority  are  processed  in  parallel. 

To analyze the parallel running time of this algorithm, we first make a dependency graph
$G'$ with a node for every packet and an edge between two nodes if the corresponding packets
can be involved in the same event. Each edge is directed towards the node corresponding
to the packet of lesser priority. By Brent's Theorem [9], the parallel running time of the
algorithm is then at most twice the length of the longest directed path in $G'$.

Let $D$ denote the maximum degree of $G'$. There are at most $ND^L$ paths of length $L$
in $G'$. The probability that any particular path of length $L$ has all of its edges directed
in the same way is at most $2/L!$ (the factor of 2 appears because there are two possible
orientations for the edges). Hence, with probability near 1, the longest directed path length
in $G'$ is $O(D + \log N)$. This is because if $L \ge k(D + \log N)$, for some large constant $k$,
then $ND^L \cdot 2/L! \ll 1$.
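The random-priority scheme can be simulated with a short sketch. It assumes the conflict graph is given explicitly as a list of packet pairs and that a packet is processed in the first round in which all of its higher-priority neighbors are already done; the number of rounds then equals the number of nodes on the longest directed path. The function name and the input representation are hypothetical.

```python
import random


def parallel_rounds(conflicts, num_packets, seed=0):
    """Rounds needed when packets are processed in a random priority order.

    `conflicts` lists pairs (u, v) of packets that can be involved in the
    same event. A packet runs one round after the last of its conflicting
    packets of higher priority.
    """
    rng = random.Random(seed)
    order = list(range(num_packets))
    rng.shuffle(order)                          # order[0] has the highest priority
    rank = {p: i for i, p in enumerate(order)}
    neighbors = {p: [] for p in range(num_packets)}
    for u, v in conflicts:
        neighbors[u].append(v)
        neighbors[v].append(u)
    round_of = {}
    for p in order:                             # visit packets by decreasing priority
        earlier = [round_of[q] for q in neighbors[p] if rank[q] < rank[p]]
        round_of[p] = 1 + max(earlier, default=0)
    return max(round_of.values(), default=0)


# Without conflicts every packet is processed in the first round.
assert parallel_rounds([], 5) == 1
# Three mutually conflicting packets must be processed one after another.
assert parallel_rounds([(0, 1), (0, 2), (1, 2)], 3) == 3
```

On sparse conflict graphs the round count is typically far below the number of packets, which is the behavior the $O(D + \log N)$ bound captures.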

Each packet can be involved in at most $(2I^3 + 2I^2)(2I^3 + I^2)\log^2 I$ events, and at most
$r(I + T) \le O(I^2)$ packets can be involved in the same event. Hence, the degree $D$ of $G'$ is
at most $O(I^8 \log^2 I)$. By using the method of Proposition 3.2 as a preprocessing phase, we
can assume that $c$, $d$, and thus $I$, are all polylogarithmic in $\mathcal{P}$. Hence, the parallel algorithm
runs in NC, as claimed.


3.6  Concluding  remarks 


Our algorithm for packet scheduling can also be used to route messages that are composed
of sequences of packets. This is possible since our algorithm can easily maintain the property
that any two packets traveling along the same path to the same destination always
proceed in order.

The algorithms described in this chapter are randomized, but they can be derandomized
using the method of conditional probabilities [43, 52].


Chapter  4 

New  Approximation  Techniques  for 
Some  Ordering  Problems 


4.1  Introduction 


In this chapter, we address the minimum linear arrangement problem, an optimization problem
that arises in embeddings of networks into a linear array. Let G be a network with
associated  nonnegative  edge  weights.  The  weight  of  an  edge  can  represent  the  capacity,  or 
the  cost  of  communication  through  the  edge.  Informally,  a  minimum  linear  arrangement  of 
G  is  an  embedding  of  G  into  the  linear  array  such  that  (i)  we  have  a  one-to-one  mapping 
from  the  nodes  of  G  to  the  nodes  of  the  linear  array,  and  (ii)  the  weighted  sum  of  the  edge 
dilations  —  that  is,  the  cost  of  the  linear  arrangement  —  is  minimum.  In  Figure  4.1,  we 
show  a  linear  arrangement  a  for  the  network  G  with  cost  28  (in  fact  this  linear  arrangement 
is  a  minimum  linear  arrangement  of  G). 

As we saw in Chapter 1, a guest network $G$ can be emulated by a host network $H$ by
embedding $G$ into $H$. The slowdown of an emulation is the ratio $t'/t$, where $t'$ is the
number of steps on $H$ needed to emulate any $t$ steps of computation on $G$. We would
like the slowdown to be as small as possible. The slowdown of an emulation is closely
related to the dilations of the edges in the associated embedding: the dilation of an edge
introduces an extra factor in the cost of communication between the endpoint nodes of this
edge in the emulation.

This is joint work with Satish Rao, NEC Research Institute; a preliminary version of this work appears
in [47].

Figure 4.1: A graph $G$ and a minimum linear arrangement $\sigma$ of $G$.

Note that, as in the problem addressed in Chapter 2, the idea of preserving locality in
order to minimize the use of shared resources in the network also arises in the minimum
linear arrangement problem: we would like nodes that are “close” in the network
$G$ to also be “close” in the linear array, in order to minimize the average edge dilation.

Finding a minimum linear arrangement is NP-hard, even for the case when all the edges
have unit weight. An $\alpha$-approximation algorithm is an algorithm that finds a solution to
the problem whose cost is at most $\alpha$ times the cost of an optimal solution to the
problem. In this chapter, we present a polynomial-time $O(\log n)$-approximation algorithm for
finding a minimum linear arrangement of an $n$-node network, improving on the best
previous approximation bound of Even, Naor, Rao, and Schieber [13] for this problem by a
$\Theta(\log \log n)$ factor.

If the network is planar (or, more generally, if it excludes $K_{r,r}$ as a minor, for fixed $r$,
where $K_{r,r}$ is the $r \times r$ complete bipartite graph), we achieve an $O(\log \log n)$-approximation
factor for the minimum linear arrangement problem, using a variation of the algorithm
presented for the general case. We obtain this improvement by combining the techniques
used for the general case with the algorithm presented by Klein, Plotkin, and Rao [23] for
finding separators in graphs that exclude fixed $K_{r,r}$-minors, as presented in Section 4.5.

We extend our approximation techniques (and bounds) to two other problems that
involve finding a linear ordering of the nodes of a graph: the minimum containing interval
graph, and the minimum storage-time product problems. Using techniques from [48],
we can view the minimum containing interval graph problem (which we define formally
in Section 4.7) as a “vertex version” of the minimum linear arrangement problem (see
[13]). Thus, we also obtain an $O(\log n)$-approximation algorithm for general graphs, and
an $O(\log \log n)$-approximation algorithm for graphs that exclude fixed $K_{r,r}$-minors for this
problem. This improves on the best known previous approximation bounds for this
problem of $O(\log n \log \log n)$ for general graphs [13], and of $O(\log n)$ for graphs that exclude
fixed $K_{r,r}$-minors.


We can also use techniques from [48] to extend our ideas to produce an
$O(\log T)$-approximation algorithm for the minimum storage-time product problem (defined in
Section 4.6), improving on a previous approximation bound of $O(\log T \log \log T)$ [13], where
$T$ is the sum of the processing times of all tasks. The minimum storage-time product
problem also generalizes the minimum linear arrangement problem. The techniques in [23] do
not apply to directed graphs; therefore, the approach used in the two former problems that
led to better approximation bounds for graphs that exclude $K_{r,r}$-minors does not apply to
the minimum storage-time product problem.


Our approximation techniques rely on a lower bound $W$ on the cost of an optimal
solution provided by a spreading metric (to be defined soon), for each of the problems
considered: We find a solution to the problem that has cost $O(W \log n)$ ($O(W \log T)$ for
the minimum storage-time product problem). Alon and Seymour [50] showed that there
exists a logarithmic gap between the lower bound provided by any spreading metric, and
the true optimal values for certain instances of the problems of minimum linear
arrangement, minimum containing interval graph, and minimum storage-time product. Thus we
provide an existentially tight bound on the relationship between the lower bound provided
by spreading metrics and the true optimal values for these problems.

4.1.1  Previous  work

Leighton and Rao [29] presented an $O(\log n)$-approximation algorithm for balanced
partitions of graphs. Among other applications, this provided $O(\log^2 n)$-approximation
algorithms for the minimum feedback arc set, and for the minimum-cut linear
arrangement problem. Hansen [19] used the ideas in [29] to present $O(\log^2 n)$-approximation
algorithms for the minimum linear arrangement problem, and for the more general
problem of graph embeddings in $d$-dimensional meshes. Ravi, Agrawal, and Klein [48]
presented polynomial-time approximation algorithms that deliver a solution with cost within
an $O(\log n \log T)$ factor from optimal for the minimum storage-time product problem,
where $T$ is the sum of the processing times of all tasks, and within an $O(\log^2 n)$ factor
from optimal for the minimum containing interval graph.

Seymour [50] was the first to present a directed graph decomposition divide-and-conquer
approach that does not rely on balanced cuts. He presented a polynomial-time
$O(\log n \log \log n)$-approximation algorithm for the minimum feedback arc set problem. Even,
Naor, Rao, and Schieber [13] extended the spreading metric approach used by Seymour
to obtain polynomial-time $O(\log n \log \log n)$-approximation algorithms for the minimum
linear arrangement, and the minimum containing interval graph problems, and an
$O(\log T \log \log T)$-approximation algorithm for the minimum storage-time product problem.
Even et al. actually showed similar approximation results for a broader class of graph optimization
problems, namely for the problems that satisfy their “approximation paradigm”: A graph
optimization problem satisfies this paradigm if (i) the divide-and-conquer approach
presented by Even et al. is applicable to the problem; and (ii) a spreading metric that provides
a lower bound on the cost of an optimal solution to the problem can be computed in
polynomial time. They defined spreading metrics that led to polynomial-time algorithms for these
problems with an $O(\min\{\log W \log \log W, \log k \log \log k\})$ approximation bound, where $k$
denotes the number of “interesting” nodes in the problem instance (clearly $k \le n$), and
$W$ is the lower bound on the cost of a solution to the optimization problem provided by
a spreading metric. Examples of such problems, besides the ones already mentioned, are
graph embeddings in $d$-dimensional meshes, symmetric multicuts in directed networks,
$k$-multiway separators and $\rho$-separators (for small values of $\rho$) in directed graphs. For a
detailed description of each of those problems, see [13].

Even, Naor, Rao, and Schieber [14] also extended the spreading metric techniques to
graph partitioning problems. They used simpler recursions that yield a logarithmic
approximation factor for balanced cuts and multiway separators. However, they were not able to
extend this simpler technique to obtain a logarithmic approximation bound for the other
problems considered in [13].

4.1.2  Spreading  metrics  and  our  recursion 

Our algorithms use an approach that relies on spreading metrics. Spreading metrics have
been used in recent divide-and-conquer techniques to obtain improved approximation
algorithms for several graph optimization problems that are NP-hard [13, 50]. These techniques
perform the divide step according to the cost of a solution to the subproblems generated,
rather than according to the size of such subproblems.

A  spreading  metric  on  a  graph  is  an  assignment  of  lengths  to  the  edges  or  nodes  of  the 
graph  that  has  the  property  of  “spreading  apart”  (with  respect  to  the  metric  lengths)  all  the 
nontrivial  connected  subgraphs.  The  volume  of  the  spreading  metric  is  the  sum,  taken  over 
all  edges  (resp.,  nodes),  of  the  length  of  each  edge  (resp.,  node)  multiplied  by  its  weight. 

For each of the optimization problems we consider here, Even, Naor, Rao, and Schieber
[13] defined a spreading metric of volume $W$ such that $W$ is a lower bound on the cost of
a solution to the problem. Our techniques are based on showing that a spreading metric
of volume $W$ can be used to find a solution with cost $O(W \log n)$ ($O(W \log T)$, for the
minimum storage-time product problem).

We develop a recursion where at each level we identify a cost which, if incurred, yields
subproblems with reduced spreading metric volume. Specifically, we present a
divide-and-conquer strategy where the cost of a solution to a problem at a recursive level is $C$ plus
the cost of a solution to the subproblems, and where the spreading metric volume on the
subproblems is less than the original volume by $\Omega(C/\log n)$ (resp., $\Omega(C/\log T)$ for the
minimum storage-time product problem). We will show that this ensures that the resulting
solution has cost $O(\log n)$ (resp., $O(\log T)$) times the original spreading metric volume.

The recursion is based on divide-and-conquer: we find an edge set whose
removal divides the graph into subgraphs, and then recursively order the subgraphs. The
cost of a recursive level is the cost associated with the edges in the cut selected at this
level. Previous recursive methods and analyses proceeded by finding a small cutset of edges
where the maximum spreading metric volumes of the subproblems were quickly reduced.
We proceed by finding a sequence of cutsets whose total cost can be upper bounded, say
by a quantity $C$, and whose total spreading metric volume is $\Omega(C/\log n)$ ($\Omega(C/\log T)$ for
the minimum storage-time product problem), as stated above. The crux of the argument is
that the cost associated with an edge in a cutset can be bounded by the number of nodes
between the previous and the next cutset in the sequence.

We point out that the methods in [13] apply to more problems, including the
$d$-dimensional graph embedding and the minimum feedback arc set problems [50]. We could
not extend our methods to these other problems, since we were unable to find a suitable
bound on the cost of a sequence of cutsets associated with any of these problems.

Finally, for planar graphs and other graphs that exclude some fixed minors, we combine
a structural theorem of Klein, Plotkin, and Rao [23] with our new recursion techniques, to
show that the spreading metric volumes are within an $O(\log \log n)$ factor of the cost
of the optimal solution for the minimum linear arrangement and the minimum containing
interval graph problems.


4.1.3  Overview 

We present a formal definition of the minimum linear arrangement problem in Section 4.2.
In Section 4.3, we define the spreading metric used for this problem; in Section 4.4, we
present a polynomial-time $O(\log n)$-approximation algorithm for the minimum linear
arrangement problem on an arbitrary graph with $n$ nodes and nonnegative edge weights. In
Section 4.5, we show how to improve this approximation factor to $O(\log \log n)$, in case
the graph has no fixed $K_{r,r}$-minors (e.g., the graph is planar). In Sections 4.6 and 4.7,
we define and briefly discuss the algorithms for approximating the minimum storage-time
product problem and minimum containing interval graph problem respectively.

4.2  The  problem

The minimum linear arrangement (MLA) problem is defined as follows: Given an
undirected graph $G(V, E)$, with $n$ nodes, and nonnegative edge weights $w(e)$, for all $e$ in $E$, we
would like to find a linear arrangement of the nodes $\sigma : V \to \{1, \ldots, n\}$ that minimizes the
sum, over all $(i,j)$ in $E$, of the weighted edge lengths $w(i,j)\,|\sigma(i) - \sigma(j)|$. In other words, we
would like to minimize the cost

$$\sum_{(i,j) \in E} w(i,j)\,|\sigma(i) - \sigma(j)|$$

of a linear arrangement $\sigma$. In the context of VLSI layout, $|\sigma(i) - \sigma(j)|$ represents the length
of the interconnection between $i$ and $j$.
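To make the objective concrete, the following minimal sketch computes the cost of a given linear arrangement and finds an optimal one by exhaustive search. The function names and the 4-cycle instance are hypothetical illustrations; the exhaustive search is exponential in $n$ and is not the algorithm of this chapter.

```python
from itertools import permutations

def arrangement_cost(edges, sigma):
    """Cost of a linear arrangement: sum of w(i,j) * |sigma(i) - sigma(j)|.

    edges: dict mapping (i, j) pairs to nonnegative weights.
    sigma: dict mapping each node to a distinct position in 1..n.
    """
    return sum(w * abs(sigma[i] - sigma[j]) for (i, j), w in edges.items())

def brute_force_mla(nodes, edges):
    """Try all n! orderings; only feasible for very small graphs."""
    best_cost, best_sigma = None, None
    for order in permutations(nodes):
        sigma = {v: pos for pos, v in enumerate(order, start=1)}
        cost = arrangement_cost(edges, sigma)
        if best_cost is None or cost < best_cost:
            best_cost, best_sigma = cost, sigma
    return best_cost, best_sigma

# Hypothetical instance: a unit-weight 4-cycle a-b-c-d-a.
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1, ("d", "a"): 1}
cost, sigma = brute_force_mla(nodes, edges)
```

Since the problem is NP-hard, a search of this kind is useful only as a reference against which to compare approximate solutions on tiny instances.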


4.3  Spreading  metric 

In  this  section,  we  define  the  spreading  metric  used  in  the  algorithms  for  the  MLA  problem 
presented  in  Sections  4.4  and  4.5.  Analogous  functions  are  used  when  approximating  the 
minimum  storage-time  product  problem  (as  presented  in  Section  4.6),  and  the  minimum 
containing  interval  graph  problem  (see  Section  4.7). 

Here we present spreading metrics in the context of the MLA problem (see [13] for a
more general definition). A spreading metric for the MLA problem is a function $\ell : E \to \mathbb{Q}$
that assigns rational lengths to every edge in $E$, and that can be computed in polynomial
time. It also satisfies the two properties below. The volume of a spreading metric $\ell$ is given
by $\sum_{e \in E} w(e)\ell(e)$.

1.  Diameter guarantee: Let the distances be measured with respect to the lengths $\ell(e)$.
The distances induced by the spreading metric “spread” the graph and all its
nontrivial subgraphs. In this application, this translates to “The diameter of every nontrivial
connected subgraph $U$ of $V$ is $\Omega(|U|)$”.

2.  Lower bound: The minimum volume of a spreading metric is a lower bound on the
cost of a MLA of $G$.

A solution $\ell$ to (4.1-4.2) is a spreading metric for the MLA (see [13]). Let $\mathcal{V}$ denote
the set of all nontrivial connected subgraphs of $V$.

(4.1)

$$\ell(e) \ge 0, \quad \forall e \in E \qquad (4.2)$$

where dist$(u, v)$ is the length of a shortest path from $u$ to $v$ according to the lengths $\ell(e)$.
The metric $\ell$ can be computed in polynomial time (see [13]) using, e.g., the ellipsoid
method (there may be an exponential number of constraints in (4.1)). Note that (4.1)
actually implies that $\ell(e) \ge 1$, for all $e$ in $E$ (simply consider the subsets $U$ that consist of
a single edge and its endpoints).

A solution $\ell^*$ to (4.1-4.2) that minimizes $\sum_{e \in E} w(e)\ell(e)$ yields a lower bound on the cost
of a MLA, since for any linear ordering $\sigma$ of the nodes of $G$, the assignment of lengths
to the edges of $G$ given by $\ell(i,j) = |\sigma(i) - \sigma(j)|$ satisfies (4.1-4.2). The volume of
such an assignment is exactly the cost of $\sigma$. In particular this is true for a MLA $\sigma$. Hence
$W^* = \sum_{e \in E} w(e)\ell^*(e)$ is less than or equal to the cost of a MLA. (Note that this lower
bound is existentially tight, since there exist instances of this problem such that
$\ell^*(i,j) = |\sigma(i) - \sigma(j)|$, where $\sigma$ is a MLA of $G$, as for example, when $G$ is a linear array.) We
will use this fact later, when proving Theorems 5 and 6. Figure 4.2 illustrates such an
assignment of lengths for the linear arrangement $\sigma$ given by the ordering of the nodes of
$G$ from left to right in this figure (the lengths $\ell(i,j)$ are the numbers associated with the
edges in that picture; without loss of generality, assume that all the edge weights are 1).

Let $\ell$ be a spreading metric of volume $W = \sum_{e \in E} w(e)\ell(e)$ that satisfies (4.1-4.2). In
the remainder of this chapter, all the distances in $G$ are measured with respect to $\ell$.


4.4  The  algorithm 

We now present our $O(\log n)$-approximation algorithm for the MLA problem on general
graphs. Let $G(V, E)$ be a graph with nonnegative edge weights $w(e)$. Assume without
loss of generality that $G$ is connected (otherwise consider each connected component of $G$
separately), and that all the edge weights $w(e)$ are at least 1.

Figure 4.2: An assignment of lengths to the edges of $G$.

In this paragraph, we introduce the notion of a level according to $\ell$. Fix a node $v$ in $V$.
An edge $(x, y)$ is at level $i$ with respect to $v$ if and only if dist$(v, x) \le i$ and dist$(v, y) > i$,
for any $i \in \mathbb{N}$. Note that an edge may be at more than one level, and that there may be
edges that are not at any level. Let the weight of level $i$, denoted by $\rho_i$, be the sum of the
weights of the edges at level $i$. Without loss of generality, we will assume that $\log W$ is an
integer. Let $\alpha_k = 2^k$, for all $k$ in $[(\log W) + 1]$. Level $i$ has index $k$, $k$ in $[\log W]$, if and
only if $\rho_i$ belongs to the interval $I_k = (\alpha_k, \alpha_{k+1}]$.
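The definition of levels can be sketched in code as follows. This is a minimal illustration, assuming a connected graph, positive edge lengths, and integer edge weights; the function names and the three-node instance are hypothetical. The index helper implements the interval $(2^k, 2^{k+1}]$ literally, so it is only meaningful for level weights of at least 2.

```python
import heapq
import math
from collections import defaultdict

def distances_from(root, adj):
    """Dijkstra from root; adj[x] is a list of (neighbor, length) pairs."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale queue entry
        for y, length in adj[x]:
            if d + length < dist.get(y, math.inf):
                dist[y] = d + length
                heapq.heappush(heap, (d + length, y))
    return dist

def level_weights(root, lengths, weights):
    """Map each level i to the total weight of the edges at level i.

    lengths and weights are dicts keyed by edge (x, y). An edge is at level i
    when its nearer endpoint is at distance <= i and its farther one at > i.
    """
    adj = defaultdict(list)
    for (x, y), length in lengths.items():
        adj[x].append((y, length))
        adj[y].append((x, length))
    dist = distances_from(root, adj)
    rho = defaultdict(int)
    for (x, y) in lengths:
        near, far = sorted((dist[x], dist[y]))
        # The integers i with near <= i < far are ceil(near), ..., ceil(far) - 1.
        for i in range(math.ceil(near), math.ceil(far)):
            rho[i] += weights[(x, y)]
    return dict(rho)

def level_index(rho_i):
    """Index k with 2**k < rho_i <= 2**(k+1), for an integer weight rho_i >= 2."""
    return (rho_i - 1).bit_length() - 1

# Hypothetical instance: a triangle with unequal lengths and weights.
lengths = {("a", "b"): 2, ("b", "c"): 1, ("a", "c"): 3}
weights = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 2}
rho = level_weights("a", lengths, weights)
```

In this instance the edge of length 3 lies on three levels, which illustrates the remark that an edge may be at more than one level.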

It follows from (4.1) that there exists a node $u$ such that dist$(v, u) > n/4$. Hence
we have at least $n/4$ distinct levels with nonzero weight. Note that since $w(e) \ge 1$ and
$\ell(e) \ge 1$, for all $e$, any level with nonzero weight must have weight at least 1. Since there
are $\log W$ distinct level indices, there are at least $n/(4 \log W)$ levels with same index $k$, for
some $k$. Let $\kappa$ be the exact number of levels of index $k$. Figure 4.3 illustrates the algorithm
and charging scheme described below.

In a recursive step of the algorithm, we cut along the sequence of $\kappa$ levels of index $k$;
i.e., we remove all the edges that are at at least one of those levels, even if they also
are at some other level of index different from $k$. For all $i$, let level $a_i$ be the $i$th level of
index $k$, in increasing order of distances to $v$. Let $H_i$ be the subgraph induced by the nodes
that are at distance greater than $a_i$ and less than or equal to $a_{i+1}$ from $v$; let $H_0$ (resp., $H_\kappa$)
be the subgraph induced by the nodes that are at distance less than or equal to $a_1$ (resp.,
greater than $a_\kappa$) from $v$. Let $n_i$ denote the number of nodes in $H_i$. We recurse on each $H_i$,
obtaining a linear arrangement $\sigma_i$ for the $n_i$ nodes in this subgraph. We combine the linear
arrangements obtained for the $H_i$'s, obtaining a linear arrangement $\sigma$ for $G$, as follows:

$$(\sigma(1), \ldots, \sigma(n)) = (\sigma_0(1), \ldots, \sigma_0(n_0), \ldots, \sigma_\kappa(1), \ldots, \sigma_\kappa(n_\kappa))$$

Figure 4.3: The algorithm and charging scheme.

Each  recursive  step  runs  in  polynomial  time;  at  each  recursive  step,  we  decompose  a 
connected  component  into  at  least  two  connected  components.  Hence  the  algorithm  runs 
in  polynomial  time. 
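The recursive step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the spreading-metric lengths are taken as given (the sketch does not solve (4.1-4.2)), edge weights are positive integers, lengths are positive, the root $v$ is chosen arbitrarily as the first node, and levels are grouped by $\lceil \log_2 \rho_i \rceil$, a shift by one from the index in the text so that a level of weight 1 also receives an index. All function names are hypothetical.

```python
import heapq
import math
from collections import Counter, defaultdict

def shortest_dists(root, edges, lengths):
    """Dijkstra over the given edge list, using the spreading-metric lengths."""
    adj = defaultdict(list)
    for x, y in edges:
        adj[x].append((y, lengths[(x, y)]))
        adj[y].append((x, lengths[(x, y)]))
    dist, heap = {root: 0}, [(0, root)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale queue entry
        for y, length in adj[x]:
            if d + length < dist.get(y, math.inf):
                dist[y] = d + length
                heapq.heappush(heap, (d + length, y))
    return dist

def arrange(nodes, edges, lengths, weights):
    """Return the nodes as a list, in the order of the linear arrangement."""
    nodes = list(nodes)
    if len(nodes) <= 1:
        return nodes
    dist = shortest_dists(nodes[0], edges, lengths)  # arbitrary root v
    if len(dist) < len(nodes):
        # Not connected: treat the root's component and the rest separately.
        parts = [[v for v in nodes if v in dist],
                 [v for v in nodes if v not in dist]]
    else:
        rho = defaultdict(int)  # rho[i] = weight of level i
        for x, y in edges:
            near, far = sorted((dist[x], dist[y]))
            for i in range(math.ceil(near), math.ceil(far)):
                rho[i] += weights[(x, y)]
        index = {i: (r - 1).bit_length() for i, r in rho.items()}
        k = Counter(index.values()).most_common(1)[0][0]  # most frequent index
        cuts = sorted(i for i in rho if index[i] == k)     # a_1 < ... < a_kappa
        # H_p holds the nodes whose distance exceeds exactly p of the cut levels.
        parts = [[] for _ in range(len(cuts) + 1)]
        for v in nodes:
            parts[sum(dist[v] > a for a in cuts)].append(v)
    order = []
    for part in parts:
        members = set(part)
        sub = [(x, y) for x, y in edges if x in members and y in members]
        order += arrange(part, sub, lengths, weights)  # concatenate sub-orders
    return order

# Hypothetical instance: a unit-weight path with unit lengths.
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
unit = {e: 1 for e in path_edges}
order = arrange(["a", "b", "c", "d"], path_edges, unit, unit)
```

Choosing the most frequent index is one way to realize the pigeonhole step in the text; with positive lengths, level 0 always contains an edge at the root, so every call on a connected graph with at least two nodes splits it into strictly smaller parts.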

We use a charging scheme to account for the length of an edge $e$ in the linear
arrangement for $G$ obtained by our algorithm (note that we account for the length of the edge in the
linear arrangement, rather than for the spreading metric length of the edge). If some edge
$e$ in level $a_i$ belongs to some other level of index $k$, say level $a_j$, then this edge also belongs
to every level of index $k$ between $a_i$ and $a_j$. Without loss of generality, assume that $i < j$.
Edge $e$ will be “stretched over” all the nodes in $H_i \cup \ldots \cup H_{j-1}$, and possibly over some
of the nodes in $H_{i-1}$ and $H_j$, in the linear arrangement produced by our algorithm. Hence
the length of such an edge in the final linear arrangement will be at most $n_{i-1} + \ldots + n_j$.
Suppose we charge $n_{p-1} + n_p$ for the portion of the edge that is stretched over the nodes in
$H_{p-1} \cup H_p$, when considering level $a_p$, for $p = i, \ldots, j$. Then the total charge associated
with edge $e$ is equal to $n_{i-1} + 2(n_i + \ldots + n_{j-1}) + n_j \ge n_{i-1} + \ldots + n_j$; that is, edge $e$
will be charged at least its length in the final linear arrangement.

We will now compute an upper bound on the cost of the linear arrangement obtained
by our algorithm. Let $C(Z)$ be the maximum cost of a linear arrangement obtained by our
algorithm for a subgraph of $G$ whose volume of the spreading metric $\ell$ is at most $Z$. Since
the sum of the weights of all edges in level $a_i$ is $\rho_{a_i}$, and since we charge for the length of an
edge as described in the preceding paragraph, we derive the following recurrence relation
for $C(W)$:

$$C(W) \le C\Bigl(W - \sum_{i=1}^{\kappa} \rho_{a_i}\Bigr) + \sum_{i=1}^{\kappa} \bigl[\rho_{a_i}(n_{i-1} + n_i)\bigr]$$
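The charging scheme behind this recurrence can be checked on small numbers. In the sketch below, the list `n` holds hypothetical sizes of the subgraphs $H_0, H_1, \ldots$, and an edge is assumed to lie on the cut levels $a_i, \ldots, a_j$ with $1 \le i \le j$; the function names are illustrative.

```python
def edge_charge(n, i, j):
    """Total charge for an edge at cut levels a_i, ..., a_j (1 <= i <= j).

    n[p] is the number of nodes in H_p; each level a_p contributes
    n[p-1] + n[p] to the charge.
    """
    return sum(n[p - 1] + n[p] for p in range(i, j + 1))

def max_edge_length(n, i, j):
    """Upper bound n_{i-1} + ... + n_j on the edge's final length."""
    return sum(n[i - 1 : j + 1])

# Hypothetical sizes of H_0, ..., H_4.
n = [3, 1, 4, 1, 5]
charge = edge_charge(n, 2, 3)        # (n_1 + n_2) + (n_2 + n_3)
length_bound = max_edge_length(n, 2, 3)  # n_1 + n_2 + n_3
```

The interior subgraphs are counted twice in the charge and once in the length bound, which is why the charge always dominates.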

We now show that $C(W) = O(W \log n)$. We first prove the following lemma:

Lemma 4.4.1 $C(W) \le cW \log W$, for some constant $c$.


Proof: We will use induction on $W$. Our base case for the induction will be the case
$W \le 0$. We can use induction on $W$ here since, for any subgraph of $G$ on $x$ nodes whose
volume of the spreading metric $\ell$ is at most $Z$ ($Z \le W$),

$$\frac{\alpha_k x}{4 \log Z} \ge \frac{\alpha_k x}{4 \log W} \ge \frac{1}{4 \log W}.$$

That is, the recursive relation above will converge to the base case in at most $4W \log W$
steps.


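The base case $W \le 0$ corresponds to a totally disconnected graph (a graph with no
edges); therefore $C(W) = 0$. If $W > 0$ then

$$
\begin{aligned}
C(W) &\le C\Bigl(W - \frac{\alpha_k n}{4 \log W}\Bigr) + 2\alpha_{k+1} n \\
&\le c\Bigl[W - \frac{\alpha_k n}{4 \log W}\Bigr] \log\Bigl[W - \frac{\alpha_k n}{4 \log W}\Bigr] + 2\alpha_{k+1} n \\
&\le c\Bigl[W - \frac{\alpha_k n}{4 \log W}\Bigr] \log W + 2\alpha_{k+1} n \\
&\le cW \log W + \alpha_{k+1} n \Bigl[2 - \frac{c}{8}\Bigr] \\
&\le cW \log W
\end{aligned}
$$

for a sufficiently large constant $c$ ($c \ge 16$). The second step follows from the induction
hypothesis, and the fourth step follows since $\alpha_{k+1} = 2\alpha_k$.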
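As a worked check of the first and fourth steps of this chain, write $\alpha_k = 2^k$ for the level-weight thresholds, $\rho_{a_i}$ for the weight of the $i$th cut level, and $\kappa \ge n/(4 \log W)$ for the number of levels cut (the symbols $\alpha$, $\rho$, and $\kappa$ are notation assumed here for readability):

```latex
% First step: each of the \kappa cut levels has weight in (\alpha_k, \alpha_{k+1}],
% and \sum_i (n_{i-1} + n_i) counts every node at most twice.
\sum_{i=1}^{\kappa} \rho_{a_i} \;\ge\; \kappa\,\alpha_k \;\ge\; \frac{\alpha_k n}{4\log W},
\qquad
\sum_{i=1}^{\kappa} \rho_{a_i}(n_{i-1}+n_i) \;\le\; \alpha_{k+1}\sum_{i=1}^{\kappa}(n_{i-1}+n_i) \;\le\; 2\alpha_{k+1} n.

% Fourth step: substitute \alpha_k = \alpha_{k+1}/2.
c\,\frac{\alpha_k n}{4\log W}\,\log W \;=\; \frac{c\,\alpha_k n}{4} \;=\; \frac{c\,\alpha_{k+1} n}{8}
\quad\Longrightarrow\quad
c\Bigl[W-\frac{\alpha_k n}{4\log W}\Bigr]\log W + 2\alpha_{k+1} n
\;=\; cW\log W + \alpha_{k+1} n\Bigl[2-\frac{c}{8}\Bigr].
```

The bracket $[2 - c/8]$ is nonpositive exactly when $c \ge 16$, which is where the stated condition on $c$ comes from. The first step also uses that $C$ is nondecreasing in its argument, which holds because $C(Z)$ is defined as a maximum over subgraphs of volume at most $Z$.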


Q.E.D.

We still need to show how to bring the approximation factor down from $O(\log W)$ to
$O(\log n)$. We will do this by using standard techniques of rescaling and rounding down the
edge weights (as in [15]).

Our goal will be to reduce, by rescaling and rounding down weights, our original input
graph $G$ to an “equivalent” input graph $G'$ whose spreading metric volume is a polynomial
in $n$. Consider the set $E'$ of edges $e$ such that $w(e) \le W/(mn)$, where $m$ denotes the number
of edges of $G$. Since an edge has length at most $n$ in any linear arrangement for $G$, the
contribution of the edges in $E'$ to a MLA of $G$ is at most $W$. Suppose we delete all those
edges, and apply a $\mu$-approximation algorithm to the resulting graph. We thus obtain a linear
arrangement of $G$ (by simply adding those edges back into the linear arrangement found)
with cost that is within a factor of $(\mu + 1)$ of the cost of a MLA of $G$.

We now round down each weight $w(e)$, for all $e$ in $E \setminus E'$, to its nearest multiple of
$W/(mn)$. The error incurred by this rounding procedure is again at most $W$. Furthermore,
we scale the rounded weights by dividing them by $W/(mn)$, obtaining new weights for the
edges that are all integers in the interval $[0, mn]$. Note that we have only changed the units
in which the weights are expressed. Hence we obtain a polynomial-time $(\mu + 2)$-approximation
algorithm for the MLA problem on $G$ with weights $w(e)$, by solving the MLA problem on
$G' = G \setminus E'$ with integral weights that belong to $[0, mn]$. The volume $W'$ of the spreading
metric for $G'$ is at most a polynomial in $n$. By Lemma 4.4.1, we have
$C(W') \le cW' \log(W') \le c'W' \log n$, for some constant $c'$. Rescaling the edge weights back by
multiplying $C(W')$ by $W/(mn)$, we obtain a linear arrangement for the original weights on $G$
with cost at most $c'W \log n$.
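The rescaling and rounding procedure can be sketched as follows. This is a minimal illustration with exact rational arithmetic; the function name and the sample instance are hypothetical, and $m$ is assumed to be the number of edges.

```python
from fractions import Fraction

def rescale_weights(weights, volume, n):
    """Drop light edges, then round down and rescale the remaining weights.

    weights: dict mapping each edge to its weight w(e).
    volume:  the spreading metric volume W.
    n:       the number of nodes.
    Returns integer weights for the edges that survive.
    """
    m = len(weights)                   # number of edges (assumed meaning of m)
    unit = Fraction(volume) / (m * n)  # the threshold W/(mn)
    rescaled = {}
    for edge, w in weights.items():
        if w <= unit:
            continue                   # edge belongs to E' and is deleted
        # Floor division rounds w down to a multiple of unit and divides by unit.
        rescaled[edge] = Fraction(w) // unit
    return rescaled

# Hypothetical instance: n = 4 nodes, m = 3 edges, volume W = 240, so W/(mn) = 20.
weights = {("a", "b"): 100, ("b", "c"): 1, ("c", "d"): 37}
new_weights = rescale_weights(weights, volume=240, n=4)
```

Because every edge length is at least 1, each weight is at most the volume $W$, so the rescaled weights fall in $[0, mn]$ as the text states.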

Finally, we can choose $\ell$ such that $\ell$ satisfies (4.1-4.2), and whose volume $W^*$
minimizes $\sum_{e \in E} w(e)\ell(e)$, over all spreading metrics $\ell$ that satisfy (4.1-4.2). As we have seen,
$W^*$ is a lower bound on the cost of a MLA. Hence, by Lemma 4.4.1 and the considerations
that follow this lemma, we have proved the following theorem:


Theorem 5 The cost of a solution to the MLA problem, obtained by our algorithm for the
spreading metric $\ell$ on $G$, is within an $O(\log n)$ factor of the cost of a MLA of $G$.

4.5  Graphs  with  excluded  minors

In this section we show how to obtain, in polynomial time, an $O(\log \log n)$-approximation
bound for the MLA problem on a graph $G$ with no fixed $K_{r,r}$-minors (e.g., on a planar
graph $G$). We denote the $r \times r$ complete bipartite graph by $K_{r,r}$.

Definition 4.5.1 Let $H$ and $G$ be graphs. Suppose that (i) $G$ contains disjoint connected
subgraphs $A_v$, for each node $v$ of $H$; and that (ii) for every edge $(u,v)$ in $H$, there is a
path $P(u,v)$ in $G$ with endpoints in $A_u$ and $A_v$, such that any node in $P(u,v)$ other than its
endpoints does not belong to any $A_w$, $w$ in $H$, nor to any $P(i,j)$, $(i,j)$ in $H \setminus (u,v)$. Then
$\bigcup_v A_v$ is said to be an $H$-minor of $G$.

Klein, Plotkin, and Rao [23] showed how to decompose (in polynomial time) a graph
with no $K_{r,r}$-minors into connected components of small diameter. In our application, this
implies that each connected component has at most a constant fraction of the nodes in $G$,
as shown in the next section.

4.5.1  The  algorithm 

We  recursively  solve  the  problem,  as  we  do  in  the  general  case.  We  combine  the  partial 
solutions  returned  by  each  recursive  step,  and  charge  for  each  edge  removed  at  a  cut  step 
in  the  same  way  as  in  the  algorithm  of  Section  4.4.  It  is  in  the  way  we  decompose  the 
graph  before  a  recursive  step  that  the  algorithm  of  Section  4.4  differs  considerably  from 
the  one  presented  in  this  section.  Before  each  recursive  step,  we  may  perform  a  series 
of  shortest  path  levelings,  to  be  defined  soon,  on  each  induced  connected  subgraph,  until 
we  can  guarantee  that  the  original  graph  has  been  decomposed  into  subgraphs  that  contain 
at  most  a  fixed  fraction  (strictly  less  than  one)  of  the  nodes  each.  In  the  algorithm  of 
Section  4.4,  we  always  perform  only  one  shortest  path  leveling  before  each  recursive  step. 

The algorithm proceeds in rounds. In each round we have a cut step, which corresponds
to the series of cuts performed during the round, and a recursive step, which consists of
recursing on the connected components that result from the cut step. Let $G(V, E)$ be a
graph on $n$ nodes that excludes $K_{r,r}$ as a minor, for some fixed $r > 0$. Let $\ell$ be a spreading
metric for $G$ of volume $W$ that satisfies (4.1-4.2).

A cut step in $G$ will produce a series of subgraphs of $G$, $G = G_0, \ldots, G_t$, $t < r$, where
each $G_{i+1}$ results from a shortest path leveling of $G_i$. Fix a node $v$ in $G_i$. A shortest path
leveling (SPL) of $G_i$ rooted at $v$ consists of an assignment of levels to the edges of $G_i$ as
follows: An edge $(x, y)$ is at level $j$ of this SPL if and only if dist$(v, x)$ (in $G_i$) is at most
$j$ and dist$(v, y)$ (in $G_i$) is greater than $j$, for all $j \in \mathbb{N}$. (An edge may be at more than one
level.)

We will cut along a sequence of levels of the SPL; one of the connected components
resulting from this cut procedure will be $G_{i+1}$. Let $n(G_i)$ denote the number of nodes
in $G_i$. Let $s = n/b$, where $b$ is a constant to be specified later. The spreading metric
diameter guarantee implies that this SPL has at least $n(G_i)/4$ levels. We will see later
that $n(G_i) = \Theta(n)$, and that we can choose $b$ such that $n(G_i)/4 > 2s$ (we need $b > 8$).
We group the levels of this SPL into bands of $2s$ consecutive levels and color the bands
alternately “blue” and “red”, in increasing order of the levels. Without loss of
generality, assume that the subgraph induced by the blue bands has at least $n(G_i)/2$ nodes.
We have $2s$ cuts of the following type: For $0 \le j \le 2s - 1$, leveled cut $j$ consists of all
the edges in the $j$th level (with respect to distance from $v$) of every red band. That is, if the
band consisting of the first $2s$ levels is colored blue, then leveled cut $j$ consists of the
levels $2s + j, 6s + j, \ldots$, for all $j$.
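
As a small illustration of the banding, the following Python sketch lists the levels that make up leveled cut $j$. It assumes that levels are numbered from 0 and that bands alternate strictly in color; the function name and parameters are hypothetical.

```python
def leveled_cut_levels(j, s, num_levels, first_band_blue=True):
    """Return the levels belonging to leveled cut j.

    Bands consist of 2s consecutive levels and alternate blue/red.
    Leveled cut j takes the j-th level of every red band.
    """
    band = 2 * s
    if not 0 <= j < band:
        raise ValueError("j must satisfy 0 <= j <= 2s - 1")
    q = 1 if first_band_blue else 0  # index of the first red band
    levels = []
    while q * band + j < num_levels:
        levels.append(q * band + j)
        q += 2  # skip the next blue band
    return levels

# With s = 2 the bands have 4 levels each; the red bands are levels 4-7 and 12-15.
print(leveled_cut_levels(1, 2, 20))  # levels 2s + j = 5 and 6s + j = 13
```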

Now we group the leveled cuts according to their indices. Let $\beta_k = W 2^k/(s \log n)$,
for all integer $k$ in $[1, 2 \log\log n]$. Let $\beta_0 = 0$ and $\beta_{2 \log\log n} =$ [illegible]. The weight of leveled
cut $j$ is the sum of the weights of the levels in the cut (the weight of a level being the sum
of the weights of the edges at that level). Leveled cut $j$ has index $k$, for all integer $k$ in
$[2 \log\log n]$, if and only if the weight of cut $j$ belongs to the interval $I_k = (\beta_k, \beta_{k+1}]$. There
are at least $2s/(2 \log\log n)$ leveled cuts with the same index $k_i$ (since there exist at least $2s$
distinct leveled cuts).
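
The grouping by index and the pigeonhole choice of $k_i$ can be sketched as follows. This is a minimal illustration, not the procedure of the text verbatim: it assumes base-2 logarithms, rounds $2 \log\log n$ up to an integer, treats the topmost interval as unbounded, and assigns weight-0 cuts to index 0. All function names are hypothetical.

```python
import math
from bisect import bisect_left
from collections import defaultdict

def cut_index(weight, W, s, n):
    """Index k such that beta_k < weight <= beta_{k+1}, with beta_0 = 0.

    Assumptions: log base 2, K = ceil(2 log log n) index classes,
    and an unbounded top interval.
    """
    K = max(1, math.ceil(2 * math.log2(math.log2(n))))
    betas = [0.0] + [W * 2 ** k / (s * math.log2(n)) for k in range(1, K)]
    # bisect_left finds the first threshold >= weight; the index is one less.
    return max(0, bisect_left(betas, weight) - 1)

def largest_index_class(cut_weights, W, s, n):
    """Group leveled cuts by index and return the most populated class."""
    classes = defaultdict(list)
    for j, weight in enumerate(cut_weights):
        classes[cut_index(weight, W, s, n)].append(j)
    k = max(classes, key=lambda idx: len(classes[idx]))
    return k, classes[k]

# Thresholds here are beta_1 = 8, beta_2 = 16, beta_3 = 32.
print(largest_index_class([5, 8, 9, 100, 3, 7], W=16, s=1, n=16))
```

By the pigeonhole principle the returned class always contains at least a $1/K$ fraction of the cuts, which mirrors the counting argument above.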

If $k_i > 0$, then we cut along these at least $s/(\log\log n)$ leveled cuts of index $k_i$, and
recurse on the resulting connected components. In this case, we let $t = i$, and the cut
step of this round is complete. Otherwise, we first cut along only one of the leveled cuts
of index $k_i = 0$ (chosen arbitrarily). Then we check whether there exists a connected
component $G_{i+1}$ of $G_i$ with more than $n(G_i)/2$ nodes. In case no such component exists,
we let $t = i$ (the cut step of this round is complete), and we recurse. If $i = r - 1$, we also let
$t = i = r - 1$ and recurse. Otherwise, we proceed by performing an SPL on $G_{i+1}$, following
the procedure just described, with $i = i + 1$.

The number of nodes in $G_i$, $n(G_i)$, is proportional to $n$, for all $i$ in $[r]$. This follows
since $n(G_{i+1}) > n(G_i)/2$, for all $i$, by the choice of $G_{i+1}$, and since $r$ is a constant.

Suppose we just performed a series of $r$ SPL’s and corresponding cut procedures. The
last cut performed, on $G_{r-1}$, generated a collection of connected components of $G_{r-1}$.
Klein, Plotkin, and Rao [23] proved that the distance in $G$ between any pair of nodes in
any such component is $O(s)$ (where the constant in the $O(\cdot)$ notation depends only on $r$).
Thus, for a suitably chosen constant $b$, we can ensure that the distance between any pair of
nodes is at most $n/6$, in any such component.

It follows from the result by Klein, Plotkin, and Rao that any connected component
that results from this cut step has at most $2n/3$ nodes, as we now show. Fix any node
$u$ in $G$. It follows from (4.1) that any subgraph of $G$ on $(n - x)$ nodes that contains $u$
has a node at distance at least $(n - x)/4$ from $u$. Suppose we start with the graph $G$,
and proceed by removing one node at a time, always choosing a node that has maximum
distance to $u$ among the remaining nodes. As long as fewer than $n/3$ nodes have been
removed, that is, $x < n/3$, some remaining node is at distance at least $(n - x)/4 > n/6$
from $u$. Thus, we need to remove at least one-third of
the nodes before we are left only with nodes that are within distance $n/6$ from $u$ in $G$.
This implies that any resulting connected component of $G_{r-1}$ has at most $2n/3$ nodes. Any
other resulting connected component (of $G \setminus G_{r-1}$) has at most $n/2$ nodes, by the choice
of the $G_i$’s.

We distinguish between two types of cut steps: if $k_t = 0$, then we call the cut step in
this round a cut step with reduction in size; otherwise $k_t > 0$, and we call the cut step in
this round a cut step with reduction in volume. Note that $k_t = 0$ implies $k_j = 0$, for all $j$ in
$[t]$.

Let $C(Z, x)$ denote the maximum cost of a linear arrangement obtained by our algorithm
for a subgraph of $G$ with $x$ nodes, whose volume of the spreading metric $\ell$ is at most
$Z$.

Lemma 4.5.1 $C(W, n) \le cW \log\log n$, for some constant $c$.


Proof: We use induction on $n$ and $W$: We apply induction on $n$ whenever we have a cut
step with reduction in size, and we use induction on $W$ whenever we have a cut step with
reduction in volume. Our base cases will be the cases when $n \le 1$ or $W \le 0$. When we
have a cut step with reduction in volume, for a subgraph of $G$ on $x$ nodes whose volume of
the spreading metric $\ell$ is at most $Z$ ($Z \le W$), the reduction in volume in that cut step is at
least

\[ \frac{\cdots}{b \log\log x} \;\ge\; \frac{\cdots}{b \log\log n} \;\ge\; \frac{1}{b \log\log n}. \]

Thus, at every recursive step, we either reduce the volume of the spreading metric in the
remaining subgraph by at least $1/(b \log\log n)$ or we decompose the graph into more than one
connected component, all of which have at most a 2/3-fraction of the nodes. Hence, the
inductive process will converge to one of the base cases in at most $\max\{bW \log\log n, O(\log n)\}$
steps (since we can have at most $bW \log\log n$ cut steps with reduction in volume and at
most $O(\log n)$ cut steps with reduction in size).

The base cases for $W \le 0$ or $n = 1$ are trivial. Suppose we perform a cut step with
reduction in size. Let the connected components resulting from this step be $H_0, \ldots, H_p$.
Then, since we cut along $r$ leveled cuts of weight at most $2W/(s \log n)$ (we over-charge $n$
for the cost of each occurrence of an edge in each of these leveled cuts),

\begin{align*}
C(W, n) &\le \sum_{i=0}^{p} C(W_i, n_i) + r \cdot \frac{2W}{s \log n} \cdot n \\
        &\le \sum_{i=0}^{p} c\,W_i \log\log(2n/3) + \frac{2brW}{\log n} \\
        &\le cW \log\log n - \frac{cW}{3 \log n} + \frac{2brW}{\log n} \\
        &\le cW \log\log n - \frac{W(c/3 - 2br)}{\log n} \\
        &\le cW \log\log n,
\end{align*}

where $W_i$ and $n_i$ are the volume and number of nodes, respectively, associated with
component $H_i$. We have shown that every $n_i$ is at most $2n/3$. The second step follows by
induction and since $s = n/b$ and $\sum_i W_i \le W$. Note that $\log\log(2n/3) \le \log\log n - 1/(3 \log n)$, and
thus the third step follows. The last step follows for a sufficiently large constant $c$ (e.g.,
$c \ge 6br$).
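
The inequality $\log\log(2n/3) \le \log\log n - 1/(3 \log n)$ used in the third step can be checked numerically. The short Python sketch below assumes base-2 logarithms (the text does not state the base) and simply evaluates the slack for a few values of $n \ge 3$; the function name is hypothetical.

```python
import math

def loglog_gap(n):
    """Slack in: log log(2n/3) <= log log n - 1/(3 log n), with logs base 2."""
    rhs = math.log2(math.log2(n)) - 1 / (3 * math.log2(n))
    lhs = math.log2(math.log2(2 * n / 3))
    return rhs - lhs

for n in (3, 10, 100, 10**6, 10**12):
    # A nonnegative gap means the inequality holds for this n.
    print(n, loglog_gap(n) >= 0)
```

The gap shrinks roughly like $1/\log n$, which is why the term $cW/(3 \log n)$ is exactly what is available to absorb the cut cost $2brW/\log n$.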

If the cut step performed was with reduction in volume, then we performed a series
of $t \le r$ SPL’s and respective cut procedures. The last term on the right-hand side of the
first inequality below accounts for the first $(t - 1)$ leveled cuts used. The second term on
the right-hand side of that inequality accounts for the $t$th leveled cut used. The charging
scheme for the edges removed in the $t$th leveled cut of this cut step is analogous to the
scheme presented in Section 4.4.

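\begin{align*}
C(W, n) &\le C\Bigl(W - \frac{s\beta_k}{\log\log n},\, n\Bigr) + 2\beta_{k+1} n + (r - 1) \cdot \frac{2W}{s \log n} \cdot n \\
        &\le C\Bigl(W - \frac{s\beta_k}{\log\log n},\, n\Bigr) + (r + 3)\beta_k n \\
        &\le c\Bigl(W - \frac{s\beta_k}{\log\log n}\Bigr) \log\log n + (r + 3)\beta_k n \\
        &\le cW \log\log n + \beta_k n \Bigl(r + 3 - \frac{c}{b}\Bigr) \\
        &\le cW \log\log n
\end{align*}

when $c \ge b(r + 3)$. The second step above follows from $\beta_k \ge 2W/(s \log n)$, and from
$\beta_{k+1} = 2\beta_k$, $0 < k < 2 \log\log n - 1$; the third step follows by induction; the fourth step uses $s = n/b$.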

Q.E.D. 

As in Section 4.4, we choose a spreading metric $\ell^*$ that satisfies (4.1 - 4.2), and whose
volume $W^*$ is a lower bound on the cost of an MLA of $G$. By Lemma 4.5.1, we obtain the
following theorem:

Theorem 6 Given a graph $G$ on $n$ nodes that excludes fixed $K_{r,r}$-minors, the cost of a
solution to the MLA problem, obtained by the algorithm presented in this section for the
spreading metric $\ell^*$, is within an $O(\log\log n)$ factor times the cost of an MLA of $G$.

4.6  Minimum  storage-time  product 

In this section, we sketch our approach to approximating the storage-time product for a
directed acyclic graph $G(V, E)$. The minimum storage-time product problem arises in a
manufacturing or computational process, in which the goal is to minimize the storage-time
product of the process: We want to minimize the use of storage over time, assuming storage
is an expensive resource. Let $G(V, E)$ be an acyclic directed graph on $n$ nodes with edge
weights $w(e)$, for all $e$ in $E$, and node weights $\tau(v)$, for all $v$ in $V$. The nodes of $G$ represent
tasks to be scheduled on a single processor. The time required to process task $v$ is given by
$\tau(v)$. The weight on edge $(u, v)$, $w(u, v)$, represents the number of units of storage required
to save the intermediate results generated by task $u$ until they are consumed at task $v$. The
minimum storage-time product problem consists of finding a topological ordering¹ of the
nodes $\sigma : V \to \{1, \ldots, n\}$ that minimizes

\[ \sum_{(i,j) \in E,\ \sigma(i) < \sigma(j)} \Bigl( w(i, j) \sum_{k \,:\, \sigma(i) \le \sigma(k) \le \sigma(j)} \tau(k) \Bigr). \]

Figure 4.4 illustrates a topological ordering of the nodes of $G$ (given from left to right on
the rightmost representation of the graph) with minimum storage-time product of 26.

Figure 4.4: A minimum storage-time product of G.

This problem generalizes the MLA problem: When all tasks have unit execution time,
it becomes a directed version of the MLA problem. It is also a generalization of the
single-processor scheduling problem, if we are minimizing the weighted sum of completion times
(this problem is NP-complete [16, problem SS13, page 240]).

¹ An ordering $\sigma$ of the nodes of $G$ (where $G$ is an acyclic graph) is said to be topological if and only if for
every $(i, j) \in E$, $\sigma(i) < \sigma(j)$.

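We use a spreading metric defined as follows (see [13]). Let $E^R = \{(u, v) \mid (v, u) \in E\}$.
We define $G' = (V, E \cup E^R)$. Let $\mathcal{V}$ denote the set of all nontrivial strongly connected
subgraphs of $G'$. Find $\ell^* : E \to \mathbb{Q}$ that minimizes $\sum_{e \in E} w(e)\,\ell(e)$ while satisfying the
constraints

\begin{align*}
\cdots, &\qquad \forall u \in U,\ \forall U \in \mathcal{V} \\
\ell(i, j) \ge \tau(i) + \tau(j), &\qquad \forall (i, j) \in E
\end{align*}

where $\delta(u, v) = \mathrm{dist}(u, v) + \mathrm{dist}(v, u)$. Here we define $\mathrm{dist}(u, v)$ to be the length of a
shortest path from $u$ to $v$ in $G'$ according to the lengths $\ell(e)$ for $e$ in $E$, and where each $e$
in $E^R$ has length 0.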

For any linear ordering $\sigma$ of $V$, the assignment of lengths to the edges given by $\ell(i, j) =
\sum_{k \,:\, \sigma(i) \le \sigma(k) \le \sigma(j)} \tau(k)$, for all $(i, j)$ in $E$, satisfies the constraints above. Thus the
volume $W^*$ of the spreading metric $\ell^*$ is a lower bound on the optimal cost of a solution to
the storage-time product problem.
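
To make the objective and the induced lengths concrete, here is a minimal Python sketch. It assumes that the inner sum includes both endpoints of each edge, which is what the constraint $\ell(i, j) \ge \tau(i) + \tau(j)$ suggests; the function names and the input format (a dictionary from edges to weights) are hypothetical.

```python
def induced_lengths(order, tau, edges):
    """Length l(i, j) = sum of tau(k) over nodes k from i to j (inclusive) in the order.

    order: list of nodes in topological order (assumed input format).
    tau:   {node: processing time}.
    edges: {(i, j): storage weight w(i, j)}.
    """
    pos = {v: idx for idx, v in enumerate(order)}
    prefix = [0]
    for v in order:
        prefix.append(prefix[-1] + tau[v])
    lengths = {}
    for (i, j) in edges:
        if pos[i] >= pos[j]:
            raise ValueError("order is not topological")
        lengths[(i, j)] = prefix[pos[j] + 1] - prefix[pos[i]]
    return lengths

def storage_time_product(order, tau, edges):
    """Cost of the ordering: sum over edges of w(i, j) * l(i, j)."""
    lengths = induced_lengths(order, tau, edges)
    return sum(w * lengths[e] for e, w in edges.items())

tau = {"a": 2, "b": 1, "c": 3}
edges = {("a", "b"): 1, ("a", "c"): 2, ("b", "c"): 1}
print(induced_lengths(["a", "b", "c"], tau, edges))
print(storage_time_product(["a", "b", "c"], tau, edges))  # 1*3 + 2*6 + 1*4 = 19
```

Every induced length in this example is at least $\tau(i) + \tau(j)$, as the second constraint requires.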

Given the spreading metric constraints above we can apply the algorithm of Section 4.4
to this problem as follows. Let $T = \sum_{v \in V} \tau(v)$. There is a node $v$ such that either the
out-tree or the in-tree rooted at $v$ has depth $\Omega(T)$. Thus, we can find a sequence $a_1, \ldots, a_K$
of $K = \Omega(T/\log W)$ levels whose weights $\rho_{a_1}, \ldots, \rho_{a_K}$ are within a factor of two of each
other (as in Section 4.4).

Laying out the resulting pieces successively, we obtain a solution where the cost is
bounded by

\[ C(W) \le C\Bigl(W - \sum_{j=1}^{K} \rho_{a_j}\Bigr) + \sum_{j=1}^{K} \rho_{a_j}\,(\tau_j + \tau_{j+1}), \]

where $\tau_j$ is the sum of $\tau(v)$ over all nodes $v$ that lie between levels $a_{j-1}$ and $a_j$ (the first and
last of these sums are defined accordingly).

This recursion can be upper bounded by $O(W \log W)$, as in Section 4.4. This cost can
be reduced to $O(W \log T)$ using the standard techniques that were used in Section 4.4 to
reduce $O(\log W)$ to $O(\log n)$.
4.7  Minimum  containing  interval  graph

In this section, we sketch our approach to approximating the cost of a minimum containing
interval graph of a graph $G(V, E)$. We first introduce interval graphs. An interval graph
is a graph whose vertices can be mapped to distinct intervals in the real line such that two
vertices in the graph have an edge between them if and only if their corresponding intervals
overlap. A completion of a graph $G$ into an interval graph results in an interval graph with
the same node set as $G$ that contains $G$ as a subgraph.

We use the following characterization of interval graphs, due to [44]. An undirected
graph $G(V, E)$ on $n$ nodes is an interval graph if and only if there exists a linear ordering
$\sigma : V \to \{1, \ldots, n\}$ of the nodes in $V$ such that if an edge $(u, v)$ is in $E$, where $\sigma(u) <
\sigma(v)$, then every edge $(u, w)$, for $w$ such that $\sigma(u) < \sigma(w) < \sigma(v)$, also belongs to $E$.
This characterization implies that, for any given ordering $\sigma$, there exists a unique way of
completing $G$ into an interval graph by adding as few edges to $G$ as possible.
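
The completion induced by a fixed ordering can be sketched directly from the characterization as stated: each node is joined to every node between itself and its farthest higher-numbered neighbor. The Python below is an illustrative sketch; the function name and the edge-list input format are assumptions.

```python
def interval_completion(order, edges):
    """Minimal interval-graph completion of a graph for a fixed node ordering.

    order: list of nodes, lowest-numbered first.
    edges: iterable of undirected edges (u, v).
    Returns the completed edge set as pairs (lower node, higher node).
    """
    pos = {v: idx for idx, v in enumerate(order)}
    # reach[u] = position of the farthest higher-numbered neighbor of u.
    reach = {v: pos[v] for v in order}
    for u, v in edges:
        if pos[u] > pos[v]:
            u, v = v, u
        reach[u] = max(reach[u], pos[v])
    completed = set()
    for u in order:
        for p in range(pos[u] + 1, reach[u] + 1):
            completed.add((u, order[p]))
    return completed

star = [("x", "c"), ("y", "c")]
print(len(interval_completion(["x", "y", "c"], star)))  # 3 edges: (x, y) is added
print(len(interval_completion(["c", "x", "y"], star)))  # 2 edges: nothing is added
```

The two orderings of the same star give completions of different sizes, which is why the choice of ordering determines the cost.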

The cost of a completed graph is given by the total number of edges in the (completed)
graph. This cost can be viewed as the sum over vertices of the maximum backward stretch
of the vertex, i.e., of the distance to the farthest lower-numbered node to which the vertex
is connected. This is very similar to the MLA problem, except that the nodes are stretched
along the order rather than the edges (see [13]). Thus, our techniques also apply to this
problem.

This  problem  arises  in  several  areas,  from  computer  science,  to  biology  (see  [34]),  to 
archaeology  (e.g.,  when  finding  a  consistent  chronological  model  for  tool  use  while  making 
as  few  assumptions  as  possible  [22]). 

The spreading metric $\ell^*$ that we use (due to [13]) assigns lengths to the nodes of the
graph, rather than to its edges, as in the minimum linear arrangement and in the minimum
storage-time product problems. Let $\mathcal{V}$ denote the set of all nontrivial connected subgraphs
of $G$. The metric $\ell^*$ is a function $\ell : V \to \mathbb{Q}$ that minimizes $\bigl(\sum_{v \in V} \ell(v)\bigr)/2$, while satisfying
the constraints

\begin{align*}
\sum_{v \in U} \mathrm{dist}(u, v) \ge \tfrac{1}{4}\,(|U|^2 - 1), &\qquad \forall u \in U,\ \forall U \in \mathcal{V} \\
\ell(v) \ge 0, &\qquad \forall v \in V
\end{align*}

where $\mathrm{dist}(u, v)$ is the shortest length (given by $[\ell(u) + \cdots]$) of a path
$u, x_0, \ldots, x_p, v$, $x_i \in V$, from $u$ to $v$ in $G$.

Let $G'(V, E')$ be a completion of $G$ into an interval graph. If we let $\ell(v)$ be the degree
of node $v$ in $G'$, the cost $\bigl(\sum_{v \in V} \ell(v)\bigr)/2$ clearly gives the number of edges in $E'$. Also this
assignment of lengths to the nodes satisfies the constraints above. Hence the volume $W^*$
of the metric $\ell^*$ is a lower bound on the number of edges in a minimum containing interval
graph of $G$.

The recurrence relations that bound the cost of a solution obtained for the minimum
containing interval graph problem are analogous to the ones for the MLA problem, both
for the general case and for the excluded $K_{r,r}$-minors case.

4.8  Conclusion 

We provided an existentially tight bound on the relationship between the spreading metric
cost volumes and the true optimal values for the problems of minimum linear arrangement,
minimum containing interval graph, and minimum storage-time product.

It would be interesting to extend our techniques to obtain $O(\log n)$-approximation
algorithms for other problems. In particular, it seems natural to extend our techniques to
improve the best known approximation factors for other problems that satisfy the
“approximation paradigm” of [13]. We would then provide an existentially tight bound (on the
ratio between the value of an optimal solution and the spreading metric volume) for any
such problem.

However, since the approach used here depends on the structure of graph ordering
problems, new ideas may be required.